Flow diagrams, working explanations, and trade-offs for HTTP, Polling, Long Polling, WebSockets, SSE, WebRTC, gRPC-Web, and GraphQL Subscriptions.
Client ◄──────────────────────────────────► Server
HTTP Request → Response (one-shot)
Short Poll Request → Response → wait → repeat
Long Poll Request → ...hangs... → Response → repeat
SSE Request → Stream ←←←←← (server pushes)
WebSocket Handshake → ◄══ Full-duplex ══►
WebRTC Signaling → ◄══ Peer-to-Peer ══►
| Need | Best Fit |
|---|---|
| Fetch data once | HTTP REST / GraphQL |
| Near-real-time updates (seconds OK) | Short Polling |
| Real-time, server → client only | SSE |
| Real-time, bidirectional | WebSocket |
| Ultra-low latency media/P2P | WebRTC |
| Strongly-typed RPCs from browser | gRPC-Web |
HTTP (HyperText Transfer Protocol) is a request-response protocol. The client sends a request; the server sends back exactly one response. The connection is stateless by default.
- The client sends a request: a method (GET, POST, etc.), headers, and optionally a body.
- The server returns exactly one response: a status code (200 OK, 404 Not Found), response headers, and a body (JSON, HTML, etc.).
- With keep-alive, the same TCP connection can be reused for subsequent requests, reducing handshake overhead.

| Version | Year | Key Feature |
|---|---|---|
| HTTP/1.0 | 1996 | One request per TCP connection |
| HTTP/1.1 | 1997 | Keep-alive, pipelining, chunked transfer |
| HTTP/2 | 2015 | Multiplexing, header compression (HPACK), server push |
| HTTP/3 | 2022 | QUIC (UDP-based), zero-RTT handshake, no head-of-line blocking |
HTTP/1.1 HTTP/2 HTTP/3
┌──────┐ ┌──────┐ ┌──────┐
│ Req1 │──┐ │Req1 │──┐ │Req1 │──┐
│ Res1 │◄─┘ │Req2 │ │ MUX │Req2 │ │ MUX over
│ Req2 │──┐ │Req3 │ │ over │Req3 │ │ QUIC (UDP)
│ Res2 │◄─┘ │Res2 │◄─┘ TCP │Res1 │◄─┘
│ Req3 │──┐ │Res1 │ │Res3 │
│ Res3 │◄─┘ │Res3 │ │Res2 │
└──────┘ └──────┘ └──────┘
Sequential Multiplexed No HOL blocking
// GET request
const res = await fetch('/api/posts');
const data = await res.json();
// POST request
await fetch('/api/posts', {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({ title: 'Hello' })
});
| Pros | Cons |
|---|---|
| Simple, well-understood | No server push (HTTP/1.1) |
| Cacheable (ETags, Cache-Control) | New connection overhead (HTTP/1.1) |
| Stateless → easy to scale | Not suitable for real-time |
| Wide tooling support | Head-of-line blocking (HTTP/1.1, partially HTTP/2) |
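The "Cacheable" row deserves a concrete sketch. Below is a minimal ETag revalidation helper around a plain in-memory Map; the helper names (conditionalHeaders, storeOrReuse) are illustrative, not from any library.

```javascript
// Sketch: ETag revalidation with an in-memory cache. On 304 Not Modified
// the server sends no body, so we reuse the cached copy.
const etagCache = new Map(); // url → { etag, body }

function conditionalHeaders(cache, url) {
  // If we have a cached ETag, ask the server to skip the body when unchanged.
  const entry = cache.get(url);
  return entry ? { 'If-None-Match': entry.etag } : {};
}

function storeOrReuse(cache, url, status, etag, body) {
  if (status === 304) return cache.get(url).body; // server says "unchanged"
  cache.set(url, { etag, body });                 // fresh copy: remember it
  return body;
}

// Pairing with fetch (Node 18+ / browsers):
// const res  = await fetch(url, { headers: conditionalHeaders(etagCache, url) });
// const body = storeOrReuse(etagCache, url, res.status,
//                           res.headers.get('ETag'),
//                           res.status === 304 ? null : await res.text());
```

A revalidated response costs a round trip but almost no bandwidth, which is why ETags matter for polling-style workloads too.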
The client sends repeated HTTP requests at a fixed interval to check for new data. The server responds immediately — whether or not there's new data.
Client Server
│── GET /updates ──────────►│
│◄── 200 { data: [] } ──────│ (no new data)
│ │
│ ... wait 5 seconds ... │
│ │
│── GET /updates ──────────►│
│◄── 200 { data: [msg1] } ──│ (new data!)
│ │
│ ... wait 5 seconds ... │
│ │
│── GET /updates ──────────►│
│◄── 200 { data: [] } ──────│ (no new data)
- setInterval (or similar) triggers a fetch request every N seconds (e.g., every 5s).
- The server responds immediately, even when there is nothing new (e.g., { data: [] }).

Trade-off: The maximum delay before the client sees new data equals the polling interval. Shorter intervals = more responsive but more server load. Longer intervals = less load but stale data.
function startPolling(url, interval = 5000) {
const poll = async () => {
const res = await fetch(url);
const data = await res.json();
if (data.length > 0) handleNewData(data);
};
poll();
return setInterval(poll, interval);
}
// Start & stop
const timerId = startPolling('/api/notifications', 3000);
clearInterval(timerId); // stop
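One way to soften the interval trade-off is to adapt it: back off while the server keeps returning empty responses and reset to the base interval as soon as data appears. A sketch; nextInterval is a hypothetical helper, not part of any library.

```javascript
// Adaptive polling interval: double the wait on every empty response,
// snap back to the base interval the moment data arrives.
function nextInterval(current, gotData, base = 3000, max = 60000) {
  if (gotData) return base;          // activity → poll fast again
  return Math.min(current * 2, max); // quiet → progressively slow down
}

// Inside the poll loop you would reschedule with:
//   interval = nextInterval(interval, data.length > 0);
//   timerId = setTimeout(poll, interval);
```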
| Pros | Cons |
|---|---|
| Simplest to implement | Wastes bandwidth (empty responses) |
| Works everywhere (plain HTTP) | Latency = up to interval delay |
| Easy to debug | Hammers the server at scale |
| Stateless | Not truly real-time |
The client sends a request, and the server holds the connection open until new data is available (or a timeout occurs). Once the client gets a response, it immediately sends a new request.
Client Server
│── GET /updates ───────────────►│
│ │ (server holds connection)
│ ... waiting ... │
│ │ ← new data arrives!
│◄── 200 { data: [msg1] } ──────│
│ │
│── GET /updates ───────────────►│ (immediately reconnect)
│ │ (server holds again...)
│ ... waiting ... │
│ (timeout 30s) │
│◄── 204 No Content ────────────│
│ │
│── GET /updates ───────────────►│ (reconnect)
- The client sends a GET request, often including a Last-Event-ID or timestamp so the server knows what the client has already seen.
- The server holds the request open until new data arrives. If nothing arrives before the timeout, it responds with 204 No Content (or an empty body) so the connection doesn't hang forever. Proxies and load balancers also have timeouts that must be respected.

Key difference from Short Polling: instead of the client repeatedly asking "anything new?", the server answers only when there IS something new — drastically reducing empty responses.
async function longPoll(url) {
while (true) {
try {
const res = await fetch(url);
if (res.status === 200) {
const data = await res.json();
handleNewData(data);
}
// 204 = timeout, no data → just reconnect
} catch (err) {
await new Promise(r => setTimeout(r, 3000)); // backoff on error
}
}
}
| Pros | Cons |
|---|---|
| Near real-time delivery | Server holds open connections (resource cost) |
| Works through most proxies/firewalls | More complex than short polling |
| Fewer empty responses than short polling | Timeouts need careful handling |
| Universal browser support | Ordering/duplicate issues possible |
SSE is a unidirectional protocol where the server pushes data to the client over a single, long-lived HTTP connection. Built on top of HTTP — uses text/event-stream content type. The browser provides a native EventSource API.
Client Server
│── GET /events ────────────────────►│
│◄── HTTP 200 │
│◄── Content-Type: text/event-stream │
│ │
│◄── data: {"msg": "hello"} │ ← push
│ │
│◄── data: {"msg": "world"} │ ← push
│ │
│◄── event: alert │ ← named event
│◄── data: {"level": "critical"} │
│ │
│ (connection stays open...) │
- The client creates an EventSource object pointing to a URL. The browser sends a standard HTTP GET request to that endpoint.
- The server responds with Content-Type: text/event-stream and keeps the connection open. It does NOT close the response — instead it writes data incrementally.
- The server writes formatted events (data: ...\n\n) to the open response. Each double newline (\n\n) marks the end of one event.
- The EventSource API fires onmessage for default events, or custom event listeners for named events (e.g., event: notification).
- If the connection drops, the browser reconnects automatically (after a delay the server can tune via the retry: field). On reconnection, the browser sends a Last-Event-ID header with the last received id: value so the server can resume from where the client left off.
- The client ends the stream with source.close().

Event Stream Format:
data: Simple text message\n\n
event: notification
id: 42
data: {"type": "friend_request"}\n\n
retry: 5000
- data: — The payload (required)
- event: — Custom event name (default is "message")
- id: — Event ID (used for auto-reconnection with Last-Event-ID header)
- retry: — Reconnection interval in ms

const source = new EventSource('/api/events');
source.onmessage = (event) => {
const data = JSON.parse(event.data);
console.log('New message:', data);
};
source.addEventListener('notification', (event) => {
showNotification(JSON.parse(event.data));
});
source.onerror = () => console.error('SSE error');
// EventSource auto-reconnects — no manual retry needed
source.close(); // close when done
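EventSource does all the parsing for you, but the wire format above is simple enough to decode by hand, which is useful when you need SSE over a transport EventSource doesn't support (e.g., a streamed POST body read via fetch). A minimal sketch that skips spec details such as comment lines and BOM handling:

```javascript
// Minimal parser for the text/event-stream format: events are separated by
// a blank line; each line is "field: value"; multiple data lines concatenate.
function parseEventStream(chunk) {
  return chunk.split('\n\n').filter(Boolean).map((block) => {
    const event = { event: 'message', data: '', id: null, retry: null };
    for (const line of block.split('\n')) {
      const idx = line.indexOf(':');
      if (idx === -1) continue;
      const field = line.slice(0, idx);
      const value = line.slice(idx + 1).trimStart();
      if (field === 'data') event.data += (event.data ? '\n' : '') + value;
      else if (field === 'event') event.event = value;
      else if (field === 'id') event.id = value;
      else if (field === 'retry') event.retry = Number(value);
    }
    return event;
  });
}
```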
| Pros | Cons |
|---|---|
| Native browser API (EventSource) | Unidirectional only (server → client) |
| Auto-reconnection built-in | Max 6 connections per domain (HTTP/1.1) |
| Works over standard HTTP | No binary data (text only) |
| Lightweight, simple protocol | Less browser support than WebSocket for older browsers |
| Event IDs for reliable delivery | |
| Works with HTTP/2 multiplexing | |
WebSocket is a full-duplex, bidirectional communication protocol. It starts as an HTTP request (upgrade handshake), then switches to a persistent TCP connection where both client and server can send messages at any time.
Client Server
│── GET /chat HTTP/1.1 ──────────────►│
│ Upgrade: websocket │
│ Connection: Upgrade │
│ Sec-WebSocket-Key: dGhlIHNh... │
│ │
│◄── HTTP 101 Switching Protocols ────│
│ Upgrade: websocket │
│ Sec-WebSocket-Accept: s3pPLM... │
│ │
│◄════════ Full Duplex ═══════════════►│
│ │
│──► {"type":"msg","text":"Hi"} │
│◄── {"type":"msg","text":"Hello!"} │
│◄── {"type":"typing","user":"Bob"} │
│──► {"type":"msg","text":"How?"} │
│ │
│──► Close Frame ─────────────────────│
│◄── Close Frame ─────────────────────│
- The client sends a GET request with special headers: Upgrade: websocket, Connection: Upgrade, and a random Sec-WebSocket-Key. This goes over the same port as HTTP (80/443).
- The server responds with HTTP 101 Switching Protocols and a Sec-WebSocket-Accept header (computed from the client's key + a magic GUID). This proves the server understands the WebSocket protocol.
- From then on, both sides exchange frames over the same TCP connection; either side can send at any time (full duplex, no request/response pairing).
- If the connection drops, the onclose event fires and the client must manually reconnect (ideally with exponential backoff + jitter to avoid thundering herd).
- A clean shutdown exchanges Close frames; a normal closure uses code 1000.

const ws = new WebSocket('wss://api.example.com/ws');
ws.onopen = () => ws.send(JSON.stringify({ type: 'join', room: 'general' }));
ws.onmessage = (event) => {
const data = JSON.parse(event.data);
// handle data.type: 'message', 'typing', 'presence', etc.
};
ws.onclose = (event) => {
if (!event.wasClean) setTimeout(() => reconnect(), 3000);
};
ws.close(1000, 'Done'); // clean close
WebSocket does not auto-reconnect. Production apps must implement:
Connection drops → onclose fires
│
▼
Start reconnection loop:
Attempt 1 → wait 1s + random jitter
Attempt 2 → wait 2s + random jitter
Attempt 3 → wait 4s + random jitter (exponential backoff)
...
Attempt N → wait min(2^N seconds, 30s max cap)
│
▼
On successful reconnect:
• Reset retry counter
• Flush any queued messages (buffered while offline)
• Re-subscribe to rooms/channels
│
▼
After max retries exceeded:
• Show error UI to user
• Optionally fall back to polling
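The schedule above reduces to a one-line delay formula. A sketch with an injectable random source so the jitter can be tested deterministically; the function name is illustrative.

```javascript
// Exponential backoff with jitter, capped at 30s:
// attempt 1 → ~1s, attempt 2 → ~2s, attempt 3 → ~4s, ... never above capMs.
function backoffDelay(attempt, baseMs = 1000, capMs = 30000, rand = Math.random) {
  const exp = Math.min(baseMs * 2 ** (attempt - 1), capMs); // exponential, capped
  return exp + rand() * baseMs;                             // + random jitter
}

// In the onclose handler:
//   setTimeout(reconnect, backoffDelay(++attempt));
```

The jitter term is what prevents the thundering herd: without it, every client that dropped at the same instant reconnects at the same instant too.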
┌──────────┐ ┌──────────────┐
│ Client │◄═══ WS ═══════►│ WS Server 1 │──┐
└──────────┘ └──────────────┘ │ ┌─────────┐
├──►│ Redis │
┌──────────┐ ┌──────────────┐ │ │ Pub/Sub │
│ Client │◄═══ WS ═══════►│ WS Server 2 │──┘ └─────────┘
└──────────┘ └──────────────┘
▲ ▲
└── Sticky Sessions ────────┘
(IP hash / cookie)
| Pros | Cons |
|---|---|
| True bidirectional, full-duplex | More complex server infrastructure |
| Very low latency | Stateful → harder to scale horizontally |
| Binary + text data support | No auto-reconnect (must implement) |
| Single TCP connection | Proxy/firewall issues (some block WS) |
| Efficient for high-frequency messages | No built-in request-response semantics |
| Wide browser support | Memory cost per connection on server |
WebRTC (Web Real-Time Communication) enables peer-to-peer audio, video, and arbitrary data transfer directly between browsers, with minimal server involvement (signaling only).
Browser A Signaling Server Browser B
│ │ │
│── Offer (SDP) ──────────►│ │
│ │── Offer (SDP) ───────────►│
│ │ │
│ │◄── Answer (SDP) ──────────│
│◄── Answer (SDP) ─────────│ │
│ │ │
│── ICE Candidates ───────►│── ICE Candidates ─────────►│
│◄── ICE Candidates ───────│◄── ICE Candidates ─────────│
│ │ │
│◄═══════════ Direct P2P Connection ══════════════════► │
│ (audio / video / data) │
Signaling (via a server) — Before peers can talk directly, they need to exchange connection metadata. This is done through a signaling server (using WebSocket, HTTP, or any transport). The signaling server does NOT relay media — it only relays setup messages.
SDP Offer/Answer — Peer A creates an SDP (Session Description Protocol) offer describing its capabilities: supported codecs, media types, encryption parameters. This offer is sent to Peer B via the signaling server. Peer B responds with an SDP answer confirming which capabilities it supports.
ICE Candidate Gathering — Both peers simultaneously discover their own network addresses using ICE (Interactive Connectivity Establishment): host candidates (local LAN IPs), server-reflexive candidates (public IP discovered via STUN), and relay candidates (via TURN). Each discovered candidate is sent to the other peer via the signaling server.
Connectivity checks — ICE performs connectivity checks on all candidate pairs (Peer A's candidates × Peer B's candidates) to find the best working path. It prioritizes direct connections over relayed ones.
Direct P2P connection established — Once a working candidate pair is found, a direct connection is established between the two browsers. All subsequent data flows peer-to-peer, bypassing the server entirely.
Secure media/data transfer — Audio and video are encrypted with SRTP, data channels use DTLS. Encryption is mandatory in WebRTC — there is no unencrypted mode.
For large groups — Pure P2P doesn't scale (N users = N×(N-1)/2 connections). For group calls, architectures use an SFU or MCU server instead (see the scaling topologies below).
// Create peer connection
const pc = new RTCPeerConnection({
iceServers: [{ urls: 'stun:stun.l.google.com:19302' }]
});
// Create data channel (or add media tracks)
const channel = pc.createDataChannel('chat');
channel.onmessage = (e) => console.log(e.data);
// Create and send offer
const offer = await pc.createOffer();
await pc.setLocalDescription(offer);
// → send offer to remote peer via signaling server
// Receive answer from remote peer
await pc.setRemoteDescription(remoteAnswer);
// Exchange ICE candidates
pc.onicecandidate = (e) => {
if (e.candidate) sendToRemotePeer(e.candidate);
};
| Pros | Cons |
|---|---|
| Peer-to-peer (low latency) | Complex setup (ICE, STUN, TURN) |
| Supports audio, video, data | Firewall/NAT traversal issues |
| Encrypted by default (DTLS/SRTP) | Higher battery usage on mobile |
| Reduces server bandwidth costs | Doesn't scale for large groups (need SFU/MCU) |
Most devices sit behind NATs (Network Address Translators) or firewalls. WebRTC needs to discover a path between two peers — that's where ICE (Interactive Connectivity Establishment) comes in.
┌─────────────────────────────────────────────────────────────────┐
│ ICE Candidate Gathering │
├─────────────────────────────────────────────────────────────────┤
│ │
│ 1. Host Candidates — Local IP addresses (LAN) │
│ 2. Server Reflexive (srflx) — Public IP via STUN │
│ 3. Relay Candidates — Relayed via TURN (fallback) │
│ │
│ ICE tries candidates in order: Host → srflx → Relay │
│ (Fastest to slowest, cheapest to most expensive) │
└─────────────────────────────────────────────────────────────────┘
┌────────┐ "What's my public IP?" ┌────────────┐
│ Peer A │────────────────────────────►│ STUN Server │
│ (NAT) │◄────────────────────────────│ (Public) │
└────────┘ "Your IP is 203.0.113.5 └────────────┘
port 54321"
Public STUN servers are freely available (e.g., stun:stun.l.google.com:19302). When no direct path can be found (symmetric NATs, strict firewalls), ICE falls back to a TURN (Traversal Using Relays around NAT) server that relays all traffic (slower and costlier, but it works almost everywhere):

┌────────┐                  ┌─────────────┐                  ┌────────┐
│ Peer A │═══ Encrypted ═══►│ TURN Server │═══ Encrypted ═══►│ Peer B │
│ (NAT) │◄═════════════════│ (Relay) │◄═════════════════│ (NAT) │
└────────┘ └─────────────┘ └────────┘
ICE orchestrates the entire process:
1. Gather all candidates (host, srflx via STUN, relay via TURN)
2. Exchange candidates with remote peer via signaling server
3. Pair local candidates with remote candidates
4. Run connectivity checks on each pair (STUN Binding Requests)
5. Select the best working pair (lowest latency, direct > relay)
6. Begin media/data transfer on the selected path
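Step 5's "direct > relay" preference comes from a priority formula (RFC 8445): each candidate gets a 32-bit priority dominated by its type. A sketch using the typical type-preference values (host 126, srflx 100, relay 0); the localPref and componentId defaults are illustrative.

```javascript
// RFC 8445 candidate priority:
//   priority = 2^24 * typePreference + 2^8 * localPreference + (256 - componentId)
// Higher priority pairs are checked first, so host beats srflx beats relay.
const TYPE_PREF = { host: 126, srflx: 100, relay: 0 };

function candidatePriority(type, localPref = 65535, componentId = 1) {
  return (2 ** 24) * TYPE_PREF[type] + (2 ** 8) * localPref + (256 - componentId);
}
```

With these defaults the srflx value comes out to 1694498815, the same number that appears on the srflx a=candidate line in the SDP example later in this document.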
ICE States:
| State | Meaning |
|---|---|
| new | ICE agent created, no candidates gathered yet |
| gathering | Collecting local candidates (host, srflx, relay) |
| checking | Running connectivity checks on candidate pairs |
| connected | At least one working pair found |
| completed | All checks done, best pair selected |
| failed | No working pair found (connection impossible) |
| disconnected | Connectivity lost (may recover) |
| closed | ICE agent shut down |
peerConnection.oniceconnectionstatechange = () => {
console.log('ICE state:', peerConnection.iceConnectionState);
// 'checking' → 'connected' → 'completed' (happy path)
// 'checking' → 'failed' (no path found)
// 'connected' → 'disconnected' → 'connected' (network hiccup)
};
Trickle ICE vs Full ICE:
| Approach | Description | Speed |
|---|---|---|
| Full ICE | Gather ALL candidates first, then send offer/answer | Slower (waits for TURN) |
| Trickle ICE | Send candidates as they're discovered, incrementally | Faster (connection starts sooner) |
Trickle ICE is the modern default — candidates are sent via signaling as they appear.
SDP is the metadata format exchanged during WebRTC signaling. It describes the media capabilities of each peer.
v=0
o=- 4611731400430051 2 IN IP4 127.0.0.1
s=-
t=0 0
m=audio 49170 RTP/SAVPF 111 103 104
c=IN IP4 203.0.113.5
a=rtpmap:111 opus/48000/2 ← Opus audio codec
a=fmtp:111 minptime=10;useinbandfec=1
a=candidate:0 1 UDP 2113667327 192.168.1.5 54321 typ host
a=candidate:1 1 UDP 1694498815 203.0.113.5 54321 typ srflx raddr 192.168.1.5 rport 54321
m=video 51372 RTP/SAVPF 96 97
a=rtpmap:96 VP8/90000 ← VP8 video codec
a=rtpmap:97 H264/90000 ← H264 video codec
Key SDP Fields:
| Field | Meaning |
|---|---|
| m= | Media line (audio, video, or application for data channels) |
| a=rtpmap: | Codec mapping (which codecs the peer supports) |
| a=candidate: | ICE candidate (IP, port, type) |
| a=fingerprint: | DTLS certificate fingerprint (for encryption verification) |
| a=ice-ufrag / a=ice-pwd | ICE credentials for connectivity checks |
Offer/Answer Model:
Peer A creates Offer SDP → "I can do Opus audio + VP8/H264 video"
Sends via signaling server
Peer B receives Offer → "I can do Opus audio + VP8 video (no H264)"
Peer B creates Answer SDP → "Let's use Opus + VP8"
Sends via signaling server
Peer A receives Answer → Negotiation complete, start media
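At its core, the negotiation above is an intersection of capabilities: the answer may only select codecs the offer contained. A toy sketch (real SDP negotiation also carries payload-type numbers and per-codec parameters, and orders by the answerer's preference):

```javascript
// "I can do Opus + VP8/H264" ∩ "I can do Opus + VP8" → "Let's use Opus + VP8".
function negotiateCodecs(offered, supported) {
  return offered.filter((codec) => supported.includes(codec));
}
```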
// ─── Getting User Media ───
const stream = await navigator.mediaDevices.getUserMedia({
video: { width: 1280, height: 720, frameRate: 30 },
audio: { echoCancellation: true, noiseSuppression: true }
});
// ─── Screen Sharing ───
const screenStream = await navigator.mediaDevices.getDisplayMedia({
video: { cursor: 'always' },
audio: true // system audio (if supported)
});
// ─── Adding Tracks to Peer Connection ───
stream.getTracks().forEach(track => {
peerConnection.addTrack(track, stream);
});
// ─── Receiving Remote Tracks ───
peerConnection.ontrack = (event) => {
const [remoteStream] = event.streams;
remoteVideoElement.srcObject = remoteStream;
};
// ─── Transceivers (fine-grained control) ───
const transceiver = peerConnection.addTransceiver('video', {
direction: 'sendrecv', // 'sendonly', 'recvonly', 'inactive'
sendEncodings: [
{ rid: 'high', maxBitrate: 2500000 }, // Simulcast layers
{ rid: 'mid', maxBitrate: 500000, scaleResolutionDownBy: 2 },
{ rid: 'low', maxBitrate: 150000, scaleResolutionDownBy: 4 }
]
});
WebRTC is P2P by design, but group calls need different topologies:
┌─────────────────────────── MESH ──────────────────────────────┐
│ │
│ A ◄══════► B Each peer connects to EVERY other │
│ ▲ ╲ ╱ ▲ peer. N peers = N×(N-1)/2 connections│
│ ║ ╲ ╱ ║ Good for: 2-4 participants │
│ ║ ╲╱ ║ Bad for: 5+ (CPU/bandwidth explodes) │
│ ▼ ╱ ╲ ▼ │
│ C ◄══════► D │
└────────────────────────────────────────────────────────────────┘
┌─────────────────────────── SFU ───────────────────────────────┐
│ (Selective Forwarding Unit) │
│ │
│ A ──send──► ┌─────┐ ──forward──► B │
│ B ──send──► │ SFU │ ──forward──► A │
│ C ──send──► │ │ ──forward──► A, B, D │
│ D ──send──► └─────┘ ──forward──► A, B, C │
│ │
│ Each peer sends ONE stream to SFU. │
│ SFU selectively forwards to each recipient. │
│ No transcoding — just routing. Low server CPU. │
│ Used by: Zoom, Google Meet, Twilio, Jitsi │
└────────────────────────────────────────────────────────────────┘
┌─────────────────────────── MCU ───────────────────────────────┐
│ (Multipoint Conferencing Unit) │
│ │
│ A ──send──► ┌─────┐ │
│ B ──send──► │ MCU │ ──mixed stream──► A, B, C, D │
│ C ──send──► │ │ │
│ D ──send──► └─────┘ │
│ │
│ MCU decodes ALL streams, mixes them into ONE composite │
│ stream, re-encodes, and sends to each peer. │
│ Very CPU-intensive server. Low client bandwidth. │
│ Used by: Legacy video conferencing systems │
└────────────────────────────────────────────────────────────────┘
| Topology | Upload | Download | Server CPU | Best For |
|---|---|---|---|---|
| Mesh | N-1 streams | N-1 streams | None | 2-4 people, simple |
| SFU | 1 stream | N-1 streams | Low (routing) | 5-50+ people |
| MCU | 1 stream | 1 stream | Very High (transcoding) | Low-bandwidth clients |
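The table rows follow directly from small formulas, which is exactly why mesh "explodes" past a handful of peers. A sketch; the helper names are illustrative.

```javascript
// Total P2P links in a full mesh: every pair of peers needs one connection.
const meshConnections = (n) => (n * (n - 1)) / 2;

// Per-peer stream counts for each topology (n = number of participants).
const streamsPerPeer = (topology, n) => ({
  mesh: { up: n - 1, down: n - 1 }, // every peer encodes for everyone
  sfu:  { up: 1,     down: n - 1 }, // one upload, server fans out
  mcu:  { up: 1,     down: 1 },     // server mixes into one composite stream
}[topology]);
```

Four peers in a mesh means 6 links and 3 outgoing encodes per device; at ten peers it is already 45 links, which is where the SFU pays off.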
┌────────────────────────────────────────────────────────────────┐
│ WebRTC Security Stack │
├────────────────────────────────────────────────────────────────┤
│ │
│ ┌────────────────┐ ┌─────────────────────────────────────┐ │
│ │ Data Channels │ │ Media (Audio/Video) │ │
│ │ ───────────── │ │ ───────────────────── │ │
│ │ SCTP over DTLS │ │ SRTP (Secure RTP) │ │
│ │ │ │ Keys exchanged via DTLS │ │
│ └────────┬───────┘ └──────────────┬──────────────────────┘ │
│ │ │ │
│ ▼ ▼ │
│ ┌─────────────────────────────────────────────────────────┐ │
│ │ DTLS (Datagram Transport Layer Security) │ │
│ │ • TLS-like encryption for UDP │ │
│ │ • Certificate fingerprints verified via SDP │ │
│ │ • Mutual authentication (both peers verify) │ │
│ └─────────────────────────────────────────────────────────┘ │
│ │ │
│ ▼ │
│ ┌─────────────────────────────────────────────────────────┐ │
│ │ ICE + UDP/TCP Transport │ │
│ └─────────────────────────────────────────────────────────┘ │
│ │
│ Key Points: │
│ • ALL WebRTC communication is encrypted by default │
│ • Cannot be disabled — encryption is mandatory │
│ • SRTP keys derived from DTLS handshake (no external KMS) │
│ • getUserMedia requires HTTPS (secure context) or localhost │
│ • User must grant explicit permission for camera/mic │
└────────────────────────────────────────────────────────────────┘
┌──────────┐ ┌──────────────┐ ┌──────────┐
│ Peer A │ │ Signaling │ │ Peer B │
│ │ │ Server │ │ │
│ 1. Create│ │ (WebSocket/ │ │ │
│ PeerConn │ HTTP) │ │ │
│ │ │ │ │ │
│ 2. getUserMedia() │ │ │ │
│ (camera+mic) │ │ │ │
│ │ │ │ │ │
│ 3. addTrack() │ │ │ │
│ │ │ │ │ │
│ 4. createOffer() │ │ │ │
│ 5. setLocalDesc() │ │ │ │
│ │ │ │ │ │
│ 6. ──Offer SDP────►│──Offer SDP───►│ │ │
│ │ │ │ 7. setRemoteDesc() │
│ │ │ │ 8. createAnswer() │
│ │ │ │ 9. setLocalDesc() │
│ │ │ │ │ │
│ │◄─Answer SDP─│◄─Answer SDP──│ │
│ 10. setRemoteDesc() │ │ │ │
│ │ │ │ │ │
│ 11. ICE candidates ◄═══════════════► ICE candidates │
│ (trickled via signaling) │ │
│ │ │ │ │ │
│ 12. DTLS Handshake ◄═════ P2P ═════► DTLS Handshake │
│ │ │ │ │ │
│ 13. ◄════ SRTP Media + SCTP Data ════► │ │
│ │ │ │ │ │
│ 🎉 Connected! │ │ 🎉 Connected! │
└──────────┘ └──────────────┘ └──────────┘
gRPC-Web allows browsers to call gRPC services using Protocol Buffers. It provides strongly-typed, efficient RPC communication but requires a proxy (like Envoy) since browsers can't do native HTTP/2 gRPC.
Browser ──► gRPC-Web ──► Envoy Proxy ──► gRPC Server
(HTTP/1.1 (translates (HTTP/2
or HTTP/2) to gRPC) native)
Define service contracts — Services are defined in .proto files using Protocol Buffers (protobuf). These define the RPC methods, request types, and response types in a language-neutral schema.
Code generation — The .proto file is compiled into client-side JavaScript/TypeScript stubs using protoc with the gRPC-Web plugin. This generates typed request/response classes and service client methods — no manual HTTP calls needed.
Client makes RPC call — The browser calls the generated client method (e.g., client.getUser(request, ...)). Under the hood, gRPC-Web serializes the request into a compact binary protobuf format and sends it as an HTTP/1.1 or HTTP/2 request with Content-Type: application/grpc-web.
Proxy translates — Browsers cannot speak native gRPC (which requires HTTP/2 trailers and full-duplex streaming). An Envoy proxy (or similar) sits between the browser and the gRPC server, translating the gRPC-Web request into a native gRPC request.
Server processes — The backend gRPC server processes the request and responds in native gRPC format. The proxy translates it back to gRPC-Web format for the browser.
Streaming support — gRPC-Web supports server streaming (server sends a stream of messages in response to a single request). However, client streaming and bidirectional streaming are NOT supported in the browser due to HTTP limitations.
// user.proto — Service definition
service UserService {
rpc GetUser (GetUserRequest) returns (User);
rpc ListUsers (ListUsersRequest) returns (stream User);
}
// Client call (using generated stubs)
const request = new GetUserRequest();
request.setId('user-123');
client.getUser(request, {}, (err, response) => {
console.log(response.getName());
});
| Pros | Cons |
|---|---|
| Strongly typed (protobuf) | Requires proxy (Envoy) |
| Compact binary format | Not human-readable (debugging harder) |
| Code generation | Limited browser support (no bidi streaming) |
| Server streaming support | Steeper learning curve |
Protobuf is the serialization format used by gRPC. It's a binary format that is ~5-10x smaller and ~20-100x faster to parse than JSON.
┌──────────────────────── JSON vs Protobuf ────────────────────┐
│ │
│ JSON (text, 82 bytes): │
│ {"id":"user-123","name":"Alice","email":"[email protected]",│
│ "age":30,"active":true} │
│ │
│ Protobuf (binary, ~28 bytes): │
│ 0a 08 75 73 65 72 2d 31 32 33 12 05 41 6c 69 63 65 ... │
│ │
│ Savings: ~66% smaller payload │
└───────────────────────────────────────────────────────────────┘
How Protobuf Encoding Works:
message User {
string id = 1; // Field number 1 → tag = (1 << 3) | 2 = 0x0a
string name = 2; // Field number 2 → tag = (2 << 3) | 2 = 0x12
string email = 3; // Field number 3
int32 age = 4; // Varint encoding (30 = 0x1e, just 1 byte!)
bool active = 5; // 1 byte (0 or 1)
}
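The comments above ("30 = 0x1e, just 1 byte!", "tag = (1 << 3) | 2") can be reproduced in a few lines. A sketch of the two core encoding primitives:

```javascript
// Protobuf varint: 7 payload bits per byte, least-significant group first,
// high bit set on every byte except the last (the continuation flag).
function encodeVarint(n) {
  const bytes = [];
  while (n > 0x7f) {
    bytes.push((n & 0x7f) | 0x80); // low 7 bits + "more bytes follow"
    n >>>= 7;
  }
  bytes.push(n); // final byte, high bit clear
  return bytes;
}

// Field tag: field number shifted left 3 bits, OR'd with the wire type
// (wire type 2 = length-delimited, used for strings).
const fieldTag = (fieldNumber, wireType) => (fieldNumber << 3) | wireType;
```

Small integers dominate real payloads, and the varint scheme gives them one byte each, which is a large part of protobuf's size advantage over JSON's decimal digits and quoted keys.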
Key Protobuf Rules:
- Field numbers are the wire contract: never renumber existing fields, and mark deleted numbers as reserved to prevent reuse.
- Default values (0 for numbers, "" for strings, false for bools) are not serialized → saves space.

gRPC supports 4 communication patterns:
┌──────────────────────────────────────────────────────────────┐
│ gRPC Communication Patterns │
├──────────────────────────────────────────────────────────────┤
│ │
│ 1. UNARY (Request-Response) │
│ Client ──Request──► Server │
│ Client ◄──Response── Server │
│ Like a regular REST call. │
│ │
│ 2. SERVER STREAMING │
│ Client ──Request──► Server │
│ Client ◄──Stream──── Server (multiple responses) │
│ Example: Stock price feed, log tailing. │
│ │
│ 3. CLIENT STREAMING │
│ Client ──Stream──► Server (multiple requests) │
│ Client ◄──Response── Server │
│ Example: File upload, sensor data ingestion. │
│ │
│ 4. BIDIRECTIONAL STREAMING ⚠️ NOT supported in gRPC-Web │
│ Client ◄══Stream══► Server (both sides stream) │
│ Example: Chat, real-time collaboration. │
│ │
└──────────────────────────────────────────────────────────────┘
// All 4 patterns in proto definition
service ChatService {
rpc GetMessage (GetMessageRequest) returns (Message); // Unary
rpc ListMessages (ListRequest) returns (stream Message); // Server streaming
rpc UploadMessages (stream Message) returns (UploadResponse); // Client streaming
rpc Chat (stream ChatMessage) returns (stream ChatMessage); // Bidirectional
}
// ─── Server Streaming Example (gRPC-Web supported) ───
const stream = client.listMessages(new ListRequest());
stream.on('data', (message) => {
console.log('Received:', message.getText());
appendToUI(message);
});
stream.on('status', (status) => {
console.log('Stream status:', status.code, status.details);
});
stream.on('end', () => {
console.log('Stream completed');
});
| Feature | Native gRPC | gRPC-Web |
|---|---|---|
| Transport | HTTP/2 (native) | HTTP/1.1 or HTTP/2 |
| Requires proxy | No | Yes (Envoy, grpc-web-proxy) |
| Unary | ✅ | ✅ |
| Server streaming | ✅ | ✅ |
| Client streaming | ✅ | ❌ |
| Bidi streaming | ✅ | ❌ |
| Browser support | ❌ (cannot do raw HTTP/2) | ✅ |
| Binary format | Protobuf | Protobuf or base64 text |
| Used in | Backend-to-backend | Browser-to-backend |
Why browsers can't do native gRPC:
Browser APIs like fetch() and XMLHttpRequest abstract away the transport layer, so JavaScript cannot read the HTTP/2 trailers or control the framing that native gRPC requires.

// ─── gRPC Status Codes (subset relevant to frontend) ───
const grpcStatusCodes = {
0: 'OK', // Success
1: 'CANCELLED', // Client cancelled the request
2: 'UNKNOWN', // Unknown error
3: 'INVALID_ARGUMENT', // Bad request (like HTTP 400)
4: 'DEADLINE_EXCEEDED', // Timeout (like HTTP 408)
5: 'NOT_FOUND', // Resource not found (like HTTP 404)
7: 'PERMISSION_DENIED', // Forbidden (like HTTP 403)
13: 'INTERNAL', // Server error (like HTTP 500)
14: 'UNAVAILABLE', // Service unavailable (like HTTP 503)
16: 'UNAUTHENTICATED', // Not authenticated (like HTTP 401)
};
// ─── Error Handling in gRPC-Web ───
client.getUser(request, metadata, (err, response) => {
if (err) {
switch (err.code) {
case 3: // INVALID_ARGUMENT
showValidationError(err.message);
break;
case 14: // UNAVAILABLE
retryWithBackoff(() => client.getUser(request, metadata, callback));
break;
case 16: // UNAUTHENTICATED
redirectToLogin();
break;
default:
showGenericError(err.message);
}
return;
}
renderUser(response);
});
// ─── Client-side Interceptor for Auth + Logging ───
class AuthInterceptor {
intercept(request, invoker) {
// Add auth metadata to every request
const metadata = request.getMetadata();
metadata['authorization'] = `Bearer ${getToken()}`;
metadata['x-request-id'] = crypto.randomUUID();
const start = performance.now();
return invoker(request).then((response) => {
const duration = performance.now() - start;
console.log(`gRPC ${request.getMethodDescriptor().name}: ${duration}ms`);
return response;
}).catch((err) => {
if (err.code === 16) { // UNAUTHENTICATED
return refreshToken().then(() => invoker(request)); // retry
}
throw err;
});
}
}
// Apply interceptor
const client = new UserServiceClient('http://localhost:8080', null, {
unaryInterceptors: [new AuthInterceptor()],
streamInterceptors: [new AuthInterceptor()]
});
┌────────────────────── Proxy-Based (L7) ──────────────────────┐
│ │
│ Browser ──► Envoy Proxy ──► gRPC Server 1 │
│ ──► gRPC Server 2 │
│ ──► gRPC Server 3 │
│ │
│ Envoy understands gRPC protocol, can do: │
│ • Round-robin / least-connections │
│ • gRPC health checking │
│ • Per-RPC load balancing (not per-connection!) │
│ • Retries with status code awareness │
│ • Circuit breaking │
│ │
│ ⚠️ Plain TCP load balancers (L4) fail with gRPC because │
│ HTTP/2 multiplexes all RPCs over one TCP connection. │
│ L4 LB sends ALL traffic to one backend! │
└───────────────────────────────────────────────────────────────┘
GraphQL Subscriptions enable real-time data via GraphQL. Under the hood, they typically use WebSockets (via graphql-ws protocol) to push updates when subscribed data changes.
Client Server
│── Subscribe: subscription { │
│ messageAdded(room: "general") │
│ { id text author } │
│ } ─────────────────────────────►│
│ │
│◄── { messageAdded: { │ ← push
│ id: 1, text: "Hi", │
│ author: "Alice" │
│ }} │
│ │
│◄── { messageAdded: { │ ← push
│ id: 2, text: "Hello!", │
│ author: "Bob" │
│ }} │
Transport setup — The client establishes a WebSocket connection to the GraphQL server (using the graphql-ws or older subscriptions-transport-ws protocol). Regular queries/mutations continue over HTTP; only subscriptions use the WebSocket link.
Client subscribes — The client sends a subscription operation (just like a query, but with the subscription keyword). It specifies exactly which fields it wants — GraphQL's power of client-driven data shape applies here too.
Server registers the subscription — The server parses the subscription, validates it against the schema, and registers a listener. Internally, this often hooks into a Pub/Sub system (Redis, Kafka, or in-memory) that watches for relevant events.
Event triggers push — When the subscribed data changes (e.g., a new message is created via a mutation), the server's Pub/Sub system fires an event. The subscription resolver runs, resolves the data into the exact shape the client requested, and pushes it over the WebSocket.
Client receives typed data — The pushed data arrives in the same shape as a normal GraphQL response. It integrates seamlessly with the client's GraphQL cache (e.g., Apollo's InMemoryCache), so the UI updates automatically.
Unsubscribe — The client can stop listening by closing the subscription (or the entire WebSocket connection). The server de-registers the listener and frees resources.
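On the wire, the steps above map to a handful of JSON frames defined by the graphql-ws protocol (connection_init/connection_ack, subscribe, next, complete). A sketch of the client-sent frames; the id is chosen by the client, and the builder names are illustrative.

```javascript
// graphql-ws frame builders for the client side of the flow above.
const connectionInit = (payload = {}) => ({ type: 'connection_init', payload });

const subscribe = (id, query, variables) => ({
  type: 'subscribe',
  id,                              // client-chosen operation id
  payload: { query, variables },   // a normal GraphQL document + variables
});

const complete = (id) => ({ type: 'complete', id }); // unsubscribe

// Subscribing to the room example would send, after connection_ack:
const msg = subscribe(
  '1',
  'subscription OnMessageAdded($room: String!) { messageAdded(room: $room) { id text } }',
  { room: 'general' }
);
// ws.send(JSON.stringify(msg));
```

Each server push arrives as a { type: 'next', id, payload } frame carrying an ordinary GraphQL response, which is why the data slots straight into the client cache.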
# Subscription definition
subscription OnMessageAdded($room: String!) {
messageAdded(room: $room) {
id
text
author
createdAt
}
}
// Client usage (Apollo)
const { data, loading } = useSubscription(MESSAGE_SUBSCRIPTION, {
variables: { room: 'general' }
});
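Server-side, the flow above can be sketched with a minimal in-memory Pub/Sub. This is illustrative only: production servers typically use the graphql-ws transport with a Redis- or Kafka-backed PubSub implementation rather than an in-process one.

```javascript
// Minimal in-memory Pub/Sub mirroring the server-side subscription flow:
// subscribing registers a listener; publishing (e.g. from a mutation)
// pushes the payload to every registered listener.
class PubSub {
  constructor() {
    this.listeners = new Map(); // topic -> Set of callbacks
  }
  subscribe(topic, fn) {
    if (!this.listeners.has(topic)) this.listeners.set(topic, new Set());
    this.listeners.get(topic).add(fn);
    return () => this.listeners.get(topic).delete(fn); // unsubscribe handle
  }
  publish(topic, payload) {
    for (const fn of this.listeners.get(topic) ?? []) fn(payload);
  }
}

// Usage: a client subscribes to a room; a mutation publishes a message.
const pubsub = new PubSub();
const received = [];
const unsubscribe = pubsub.subscribe('messageAdded:general', (msg) => received.push(msg));
pubsub.publish('messageAdded:general', { id: 1, text: 'Hi', author: 'Alice' });
unsubscribe(); // server de-registers the listener, freeing resources
pubsub.publish('messageAdded:general', { id: 2, text: 'Hello!', author: 'Bob' });
```

After `unsubscribe()`, later publishes no longer reach the client, which is exactly the de-registration step described above.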
| Pros | Cons |
|---|---|
| Unified API (queries + mutations + subscriptions) | Adds WebSocket complexity |
| Client specifies exact data shape | Scalability challenges |
| Integrates with GraphQL cache | Overhead if only subscriptions are needed |
| Type-safe with codegen | |
How do microservices talk to each other — and how does the frontend fit into the picture?
┌────────────────────────────────────────────────────────────────────┐
│ Microservice Communication Spectrum │
├────────────────────────────────────────────────────────────────────┤
│ │
│ SYNCHRONOUS (Request-Response) ASYNCHRONOUS (Event-Driven) │
│ ───────────────────────────── ──────────────────────────── │
│ • REST (HTTP/JSON) • Message Queues (RabbitMQ) │
│ • gRPC (HTTP/2 + Protobuf) • Event Streams (Kafka) │
│ • GraphQL • Pub/Sub (Redis, NATS) │
│ • WebSocket (real-time sync) • Webhooks │
│ │
│ Caller WAITS for response Caller sends & moves on │
│ Tight coupling in time Loose coupling │
│ Simpler to reason about Better fault isolation │
│ Cascading failures possible Eventual consistency │
│ │
└────────────────────────────────────────────────────────────────────┘
| Aspect | REST | gRPC | GraphQL |
|---|---|---|---|
| Format | JSON (text) | Protobuf (binary) | JSON (text) |
| Transport | HTTP/1.1 or HTTP/2 | HTTP/2 (native) | HTTP (any version) |
| Contract | OpenAPI/Swagger | `.proto` files | Schema (SDL) |
| Code generation | Optional (OpenAPI codegen) | Built-in (protoc) | Built-in (codegen) |
| Streaming | No (use SSE/WS) | 4 types (unary, server, client, bidi) | Subscriptions (WS) |
| Payload size | Large (verbose keys) | Small (binary, no keys) | Medium (query-shaped) |
| Browser support | ✅ Native | ❌ Needs proxy (gRPC-Web) | ✅ Native |
| Best for | Public APIs, CRUD | Internal microservices | Frontend-facing APIs |
| Latency | Medium | Low | Medium |
| Learning curve | Low | Medium-High | Medium |
┌──────────────────── Typical Architecture ───────────────────┐
│ │
│ Browser/App │
│ │ │
│ │ REST / GraphQL / gRPC-Web │
│ ▼ │
│ ┌──────────────┐ │
│ │ API Gateway │ (or BFF — Backend for Frontend) │
│ │ / BFF │ │
│ └──────┬───────┘ │
│ │ │
│ ┌────┼──────────────┐ │
│ │ │ │ │
│ ▼ ▼ ▼ │
│ ┌────┐ ┌────┐ ┌──────────┐ │
│ │Svc │ │Svc │ │ Svc C │ Between services: │
│ │ A │ │ B │ │ │ • gRPC (fast, typed) │
│ └────┘ └────┘ └──────────┘ • REST (simple, universal) │
│ │ │ │ • Events (async, decoupled) │
│ └───────┼──────────┘ │
│ ▼ │
│ Message Broker │
│ (Kafka / RabbitMQ / NATS) │
└──────────────────────────────────────────────────────────────┘
Producer ──► Exchange ──► Queue ──► Consumer
┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐
│ Order │────►│ Exchange │────►│ Queue │────►│ Payment │
│ Service │ │ (routing)│ │ │ │ Service │
└──────────┘ └──────────┘ └──────────┘ └──────────┘
Producers ──► Topic (Partitions) ──► Consumer Groups
┌──────────┐ ┌─────────────────────┐ ┌──────────────┐
│ Order │─────►│ Topic: orders │─────►│ Payment Svc │
│ Service │ │ ┌─P0─┐ ┌─P1─┐ ┌─P2─┐│ │ (Group A) │
└──────────┘ │ │msg1│ │msg2│ │msg3││ └──────────────┘
┌──────────┐ │ │msg4│ │msg5│ │msg6││ ┌──────────────┐
│ Inventory│─────►│ └────┘ └────┘ └────┘│─────►│ Analytics │
│ Service │ └─────────────────────┘ │ (Group B) │
└──────────┘ └──────────────┘
┌──────────┐ publish ┌──────┐ subscribe ┌──────────┐
│ Service A │──────────────►│ NATS │───────────────►│ Service B │
└──────────┘ │ │───────────────►│ Service C │
└──────┘ └──────────┘
| Feature | RabbitMQ | Kafka | NATS |
|---|---|---|---|
| Model | Message queue | Event log/stream | Pub/sub |
| Delivery | At-least-once | At-least-once | At-most-once (core) |
| Ordering | Per queue | Per partition | No guarantee |
| Persistence | Until consumed | Configurable retention | Optional (JetStream) |
| Throughput | ~50K msg/s | ~1M+ msg/s | ~10M+ msg/s |
| Replay | ❌ | ✅ | ✅ (JetStream) |
| Best for | Task queues | Event streaming | Real-time signaling |
┌────────────────────────────────────────────────────────────────┐
│ API Gateway │
├────────────────────────────────────────────────────────────────┤
│ │
│ Browser ──HTTP──► ┌──────────────┐ ──► User Service (gRPC) │
│ │ API Gateway │ ──► Order Service (gRPC) │
│ Mobile ──HTTP──► │ │ ──► Payment Service (REST) │
│ │ (Kong / │ ──► Notification (event) │
│ 3rd Party ──────► │ Nginx / │ │
│ │ AWS ALB) │ │
│ └──────────────┘ │
│ │
│ Responsibilities: │
│ ✅ Request routing (path-based, header-based) │
│ ✅ Authentication & authorization (JWT validation) │
│ ✅ Rate limiting & throttling │
│ ✅ Request/response transformation │
│ ✅ Load balancing │
│ ✅ Circuit breaking │
│ ✅ Caching │
│ ✅ Logging, metrics, tracing │
│ ✅ CORS handling │
│ ✅ SSL termination │
│ ✅ Protocol translation (REST ↔ gRPC) │
│ │
└────────────────────────────────────────────────────────────────┘
┌───────────┐ ┌─────────────┐
│ Web App │────►│ Web BFF │──┐
│ (React) │ │ (GraphQL) │ │
└───────────┘ └─────────────┘ │
│ ┌────────────┐
┌───────────┐ ┌─────────────┐ ├────►│ User Svc │
│ Mobile App │────►│ Mobile BFF │──┤ └────────────┘
│ (iOS/And) │ │ (REST, slim)│ │ ┌────────────┐
└───────────┘ └─────────────┘ ├────►│ Order Svc │
│ └────────────┘
┌───────────┐ ┌─────────────┐ │ ┌────────────┐
│ TV App │────►│ TV BFF │──┘ │ Product Svc│
│ │ │ (minimal) │───────►└────────────┘
└───────────┘ └─────────────┘
Why BFF? Each client type gets an API tailored to its exact needs: the web app gets a rich GraphQL layer, the mobile app gets a slim REST payload, and the TV app gets a minimal surface. Each frontend team can own and evolve its BFF independently, without waiting on a shared gateway team.
┌──────────────────────────────────────────────────────────────┐
│ Service Mesh │
├──────────────────────────────────────────────────────────────┤
│ │
│ ┌──────────────────┐ ┌──────────────────┐ │
│ │ Service A │ │ Service B │ │
│ │ ┌─────────────┐ │ mTLS │ ┌─────────────┐ │ │
│ │ │ App Code │ │◄═════►│ │ App Code │ │ │
│ │ └──────┬──────┘ │ │ └──────┬──────┘ │ │
│ │ │ │ │ │ │ │
│ │ ┌──────▼──────┐ │ │ ┌──────▼──────┐ │ │
│ │ │ Sidecar │ │◄══════►│ │ Sidecar │ │ │
│ │ │ Proxy │ │ │ │ Proxy │ │ │
│ │ │ (Envoy) │ │ │ │ (Envoy) │ │ │
│ │ └─────────────┘ │ │ └─────────────┘ │ │
│ └──────────────────┘ └──────────────────┘ │
│ │
│ The app doesn't know about the mesh. │
│ Sidecar proxies handle ALL network concerns: │
│ │
│ • mTLS (mutual TLS) — automatic encryption between services │
│ • Load balancing — intelligent routing │
│ • Circuit breaking — stop calling failing services │
│ • Retries & timeouts — configurable per service │
│ • Observability — distributed tracing, metrics, access logs │
│ • Traffic shifting — canary deployments, A/B testing │
│ • Rate limiting — per-service or global │
│ │
│ Control Plane (Istiod): │
│ Manages configuration, pushes policies to all sidecars. │
│ │
└──────────────────────────────────────────────────────────────┘
Why it matters for frontend: when your API calls seem to have retries, timeouts, or auth "built in" without any explicit code, a service mesh may be handling them at the infrastructure level. Knowing this helps when debugging latency and failure scenarios.
┌──────────────────────────────────────────────────────────────────────┐
│ Event-Driven Architecture │
├──────────────────────────────────────────────────────────────────────┤
│ │
│ User places order on Web UI │
│ │ │
│ ▼ │
│ ┌──────────────┐ OrderCreated ┌──────────────────────────────┐ │
│ │ Order Service │ ───event──────► │ Event Bus │ │
│ └──────────────┘ │ (Kafka / RabbitMQ / NATS) │ │
│ └──────────┬──┬──┬────────────┘ │
│ │ │ │ │
│ ┌──────────────────────┘ │ └───────────┐ │
│ ▼ ▼ ▼ │
│ ┌──────────────┐ ┌──────────────┐ ┌──────────┐ │
│ │ Payment Svc │ │ Inventory Svc│ │ Email Svc│ │
│ │ (charge card)│ │ (reserve qty)│ │ (confirm)│ │
│ └──────┬───────┘ └──────────────┘ └──────────┘ │
│ │ │
│ ▼ │
│ PaymentCompleted event ──► more downstream actions │
│ │
│ Benefits: │
│ • Services are DECOUPLED (Order doesn't know about Payment) │
│ • Easy to ADD new consumers (e.g., add Analytics service) │
│ • FAULT TOLERANT (if Email is down, message stays in queue) │
│ • SCALABLE (each service scales independently) │
│ │
│ Challenges: │
│ • Eventual consistency (not immediate) │
│ • Debugging distributed flows is harder │
│ • Message ordering can be tricky │
│ • Need idempotent consumers (duplicate messages possible) │
│ │
└──────────────────────────────────────────────────────────────────────┘
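The idempotent-consumer challenge above can be sketched in a few lines. The names here are illustrative, and in production the set of seen IDs would live in Redis or the database rather than in process memory:

```javascript
// Idempotent consumer sketch: at-least-once delivery means the same
// message can arrive twice, so we skip IDs we have already processed.
const processedIds = new Set();

function handleMessage(msg, businessLogic) {
  if (processedIds.has(msg.id)) return false; // duplicate: safely ignore
  businessLogic(msg);                         // side effects run exactly once
  processedIds.add(msg.id);
  return true;
}

// Usage: the broker redelivers the same OrderCreated event.
let charges = 0;
const chargeCard = () => charges++;
handleMessage({ id: 'order-42', type: 'OrderCreated' }, chargeCard); // processed
handleMessage({ id: 'order-42', type: 'OrderCreated' }, chargeCard); // skipped
```

The card is charged once even though the event arrived twice, which is the property an at-least-once broker requires of its consumers.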
In a monolith, a single DB transaction covers everything. In microservices, each service has its own DB. The Saga pattern coordinates multi-service transactions.
┌──────────────── Choreography Saga ─────────────────────────┐
│ │
│ Order Svc ──OrderCreated──► Payment Svc │
│ │ │
│ PaymentDone──► Inventory Svc │
│ │ │
│ ItemReserved──► Shipping │
│ │ │
│ ShipmentCreated│
│ │
│ If Payment FAILS: │
│ Payment Svc ──PaymentFailed──► Order Svc (cancel order) │
│ │
│ Each service reacts to events and publishes new events. │
│ No central coordinator. │
└─────────────────────────────────────────────────────────────┘
┌──────────────── Orchestration Saga ───────────────────────┐
│ │
│ ┌─────────────────┐ │
│ │ Saga Orchestrator│ (central coordinator) │
│ │ (Order Saga) │ │
│ └────────┬────────┘ │
│ │ │
│ ├──► Step 1: Payment Svc.charge() │
│ │ ✅ success │
│ ├──► Step 2: Inventory Svc.reserve() │
│ │ ❌ failure │
│ ├──► Compensate: Payment Svc.refund() │
│ └──► Mark saga as FAILED │
│ │
│ The orchestrator knows the full workflow and handles │
│ compensation (rollback) for each failed step. │
└────────────────────────────────────────────────────────────┘
| Approach | Pros | Cons |
|---|---|---|
| Choreography | Decoupled, no single point of failure | Hard to track, complex for many steps |
| Orchestration | Clear workflow, easier to debug | Central coordinator = potential bottleneck |
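An orchestration saga can be sketched as a loop that runs each step in order and, on failure, replays the compensations of completed steps in reverse. This is a simplified model of the pattern, not any specific framework's API:

```javascript
// Orchestration saga sketch: each step pairs an action with a
// compensating action. On failure, completed steps are rolled back
// in reverse order, then the saga is marked FAILED.
async function runSaga(steps) {
  const completed = [];
  for (const step of steps) {
    try {
      await step.action();
      completed.push(step);
    } catch (err) {
      for (const done of completed.reverse()) {
        await done.compensate(); // e.g. refund after a failed reservation
      }
      return { status: 'FAILED', failedStep: step.name };
    }
  }
  return { status: 'COMPLETED' };
}

// Usage: payment succeeds, inventory fails, so payment is refunded.
const log = [];
const saga = [
  { name: 'payment',
    action: async () => log.push('charge'),
    compensate: async () => log.push('refund') },
  { name: 'inventory',
    action: async () => { throw new Error('out of stock'); },
    compensate: async () => log.push('release') },
];
```

Running `runSaga(saga)` charges, hits the inventory failure, refunds, and reports the failed step.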
┌──────────────── Circuit Breaker States ─────────────────────┐
│ │
│ CLOSED (normal) │
│ ┌────────────────────┐ │
│ │ Requests flow │ │
│ │ through normally │──failure threshold exceeded──┐ │
│ └────────────────────┘ │ │
│ ▲ ▼ │
│ │ OPEN (blocked) │
│ │ ┌─────────────────┐ │
│ success in half-open │ ALL requests │ │
│ │ │ instantly fail │ │
│ │ │ (fail-fast) │ │
│ │ └────────┬────────┘ │
│ │ │ │
│ │ timeout elapsed │
│ │ │ │
│ │ ▼ │
│ ┌──────┴──────────────┐ HALF-OPEN │
│ │ HALF-OPEN │ ┌─────────────────┐ │
│ │ Allow ONE test req │◄─────────│ Let a few test │ │
│ │ If success → CLOSED │ │ requests through│ │
│ │ If failure → OPEN │ └─────────────────┘ │
│ └─────────────────────┘ │
│ │
└──────────────────────────────────────────────────────────────┘
// ─── Simple Circuit Breaker Implementation ───
class CircuitBreaker {
constructor(fn, options = {}) {
this.fn = fn;
this.failureThreshold = options.failureThreshold ?? 5;
this.resetTimeout = options.resetTimeout ?? 30000;
this.state = 'CLOSED'; // CLOSED | OPEN | HALF_OPEN
this.failureCount = 0;
this.lastFailureTime = null;
}
async call(...args) {
if (this.state === 'OPEN') {
if (Date.now() - this.lastFailureTime >= this.resetTimeout) {
this.state = 'HALF_OPEN'; // try one request
} else {
throw new Error('Circuit breaker is OPEN — request blocked');
}
}
try {
const result = await this.fn(...args);
this.onSuccess();
return result;
} catch (err) {
this.onFailure();
throw err;
}
}
onSuccess() {
this.failureCount = 0;
this.state = 'CLOSED';
}
onFailure() {
this.failureCount++;
this.lastFailureTime = Date.now();
if (this.failureCount >= this.failureThreshold) {
this.state = 'OPEN';
console.warn('Circuit breaker OPENED — too many failures');
}
}
}
// Usage
const fetchUsers = new CircuitBreaker(
() => fetch('/api/users').then(r => r.json()),
{ failureThreshold: 3, resetTimeout: 10000 }
);
try {
const users = await fetchUsers.call();
} catch (err) {
showFallbackUI(); // graceful degradation
}
┌──────────────────────────────────────────────────────────────────┐
│ Frontend ↔ Microservice Communication │
├──────────────────────────────────────────────────────────────────┤
│ │
│ PATTERN 1: API Gateway (most common) │
│ ─────────────────────────────────── │
│ Frontend ──REST/GraphQL──► API Gateway ──gRPC──► Services │
│ • Single entry point for all API calls │
│ • Gateway handles auth, rate limiting, routing │
│ • Frontend doesn't know about individual services │
│ │
│ PATTERN 2: BFF (Backend for Frontend) │
│ ───────────────────────────────────── │
│ Frontend ──GraphQL──► BFF ──gRPC/REST──► Services │
│ • BFF aggregates data from multiple services │
│ • Tailored to frontend needs (no over/under-fetching) │
│ • Frontend team owns the BFF │
│ │
│ PATTERN 3: Direct Service Calls (rare, not recommended) │
│ ──────────────────────────────────────────────────── │
│ Frontend ──REST──► Service A │
│ Frontend ──REST──► Service B │
│ • Tight coupling, CORS issues, no centralized auth │
│ • Only for very simple architectures │
│ │
│ PATTERN 4: Real-Time Layer │
│ ────────────────────────── │
│ Frontend ──WebSocket──► WS Gateway ──Pub/Sub──► Services │
│ Frontend ──SSE──► Event Gateway ──Kafka──► Services │
│ • Separate real-time channel from request-response │
│ • Gateway subscribes to event bus, pushes to connected clients │
│ │
│ PATTERN 5: GraphQL Federation │
│ ───────────────────────────── │
│ Frontend ──GraphQL──► Apollo Gateway ──► Subgraph A (Users) │
│ ──► Subgraph B (Products) │
│ ──► Subgraph C (Orders) │
│ • Each microservice exposes a GraphQL subgraph │
│ • Apollo Router/Gateway composes them into a unified schema │
│ • Frontend queries one schema, gateway resolves across services │
│ │
└──────────────────────────────────────────────────────────────────┘
┌───────────────┐ Event occurs ┌────────────────────┐
│ Stripe │ ──POST /webhook───► │ Your Server │
│ (3rd party) │ (HTTP callback) │ (listener endpoint)│
└───────────────┘ └────────────────────┘
// ─── Webhook Receiver (Express) ───
app.post('/webhooks/stripe', express.raw({ type: 'application/json' }), (req, res) => {
const sig = req.headers['stripe-signature'];
try {
// Verify signature (prevent forgery)
const event = stripe.webhooks.constructEvent(req.body, sig, WEBHOOK_SECRET);
switch (event.type) {
case 'payment_intent.succeeded':
handlePaymentSuccess(event.data.object);
break;
case 'invoice.payment_failed':
handlePaymentFailure(event.data.object);
break;
}
res.status(200).json({ received: true }); // ACK quickly
} catch (err) {
res.status(400).send(`Webhook Error: ${err.message}`);
}
});
Webhook Best Practices:
- Verify signatures on every request to prevent forged payloads.
- ACK quickly: return 200 before heavy processing, and queue the work.
- Make handlers idempotent; providers retry, so duplicate deliveries are possible.
- Expose the endpoint over HTTPS only, and keep the signing secret out of client code.
| Scenario | Recommended Pattern | Why |
|---|---|---|
| Frontend fetching user profile | API Gateway + REST/GraphQL | Simple CRUD, request-response |
| Real-time chat in browser | WebSocket via WS Gateway | Bidirectional, low-latency |
| Order processing pipeline | Event-driven (Kafka/RabbitMQ) | Async, multi-step, decoupled |
| Service-to-service data fetch | gRPC | Fast, typed, streaming support |
| 3rd-party payment notification | Webhook | Server-to-server push callback |
| Live dashboard updates | SSE via Event Gateway | Server-to-client, auto-reconnect |
| Video call feature | WebRTC (+ signaling server) | P2P media, ultra-low latency |
| Multi-service data aggregation | BFF or GraphQL Federation | Reduce round trips, tailor responses |
| Distributed transaction | Saga (choreography/orchestration) | No distributed DB transactions |
| Prevent cascading failures | Circuit Breaker | Fail-fast, graceful degradation |
| Feature | HTTP | Short Poll | Long Poll | SSE | WebSocket | WebRTC |
|---|---|---|---|---|---|---|
| Direction | Client → Server | Client → Server | Client → Server | Server → Client | Bidirectional | Bidirectional (P2P) |
| Latency | Per request | Up to interval | Near real-time | Real-time | Real-time | Ultra-low |
| Connection | Short-lived | Short-lived | Medium-lived | Long-lived | Long-lived | Long-lived (P2P) |
| Protocol | HTTP | HTTP | HTTP | HTTP | WS (TCP) | UDP/TCP |
| Binary Data | ✅ | ✅ | ✅ | ❌ (text only) | ✅ | ✅ |
| Auto-reconnect | N/A | N/A | Manual | ✅ Built-in | ❌ Manual | ❌ Manual |
| Browser Support | Universal | Universal | Universal | Modern (IE❌) | Modern | Modern |
| Max Connections | 6/domain (H1) | 6/domain (H1) | 6/domain (H1) | 6/domain (H1) | No HTTP limit | No HTTP limit |
| Scalability | ★★★★★ | ★★★☆☆ | ★★★☆☆ | ★★★★☆ | ★★★☆☆ | ★★★★★ (P2P) |
| Complexity | ★☆☆☆☆ | ★☆☆☆☆ | ★★☆☆☆ | ★★☆☆☆ | ★★★☆☆ | ★★★★★ |
| Server Cost | Low | Medium-High | High | Medium | High | Low (P2P) |
Start
│
▼
Need real-time data?
/ \
No Yes
│ │
▼ ▼
Use HTTP/REST Need bidirectional?
or GraphQL / \
No Yes
│ │
▼ ▼
Server → Client Need media/P2P?
only? / \
/ \ No Yes
Yes No │ │
│ │ ▼ ▼
▼ ▼ WebSocket WebRTC
Can use SSE? Use Long
/ \ Polling
Yes No
│ │
▼ ▼
SSE Long Polling
┌─────────────────────────────────────────────┐
│ Quick Rules: │
│ │
│ • CRUD / one-time data → HTTP │
│ • Dashboard refresh → Short Polling │
│ • Notifications, feeds → SSE │
│ • Chat, collaboration → WebSocket │
│ • Video call → WebRTC │
│ • Already on GraphQL → GraphQL Subscriptions│
└─────────────────────────────────────────────┘
┌──────────┐ WebSocket ┌──────────────┐ Pub/Sub ┌───────┐
│ Browser │◄══════════════►│ WS Gateway │◄═══════════════►│ Redis │
│ (React) │ │ (Load Bal.) │ │ Pub/Sub│
└──────────┘ └──────────────┘ └───────┘
│ │
┌─────▼─────┐ ┌─────▼─────┐
│ Message │ │ Presence │
│ Service │ │ Service │
└───────────┘ └───────────┘
Why WebSocket? Bidirectional: users send AND receive messages. Typing indicators and read receipts also need both server push and client push.
┌──────────┐ SSE ┌──────────────┐
│ Browser │◄════════════════│ Score API │◄── Score Feed
│ (React) │ │ Server │
└──────────┘ └──────────────┘
Why SSE? Unidirectional: server pushes scores. Client only reads. Auto-reconnection is a bonus.
┌────────┐ WebSocket ┌────────────┐ CRDT/OT ┌──────────┐
│ User A │◄══════════►│ Collab │◄══════════►│ DB │
│ Browser │ │ Server │ └──────────┘
└────────┘ └────────────┘
┌────────┐ WebSocket ▲
│ User B │◄════════════════┘
│ Browser │
└────────┘
Why WebSocket? Both users send edits AND receive others' edits in real-time. OT/CRDT operations need bidirectional, low-latency channel.
┌──────────────┐
Driver App ──HTTP POST──►│ Location │
(GPS updates) │ Ingestion │
│ Service │
└──────┬───────┘
│
┌──────▼───────┐
Rider App ◄─── SSE/WS ──│ Tracking │
(map updates) │ Service │
└──────────────┘
Why SSE for rider? Rider only receives location updates (unidirectional). Driver sends via HTTP POST (infrequent, bursty).
| Aspect | SSE | WebSocket |
|---|---|---|
| Direction | Server → Client only | Full-duplex (both ways) |
| Protocol | HTTP | Separate WS protocol over TCP |
| Data format | Text only | Text + Binary |
| Reconnection | Automatic (built-in) | Manual implementation needed |
| Browser API | EventSource |
WebSocket |
| Use case | Notifications, live feeds | Chat, gaming, collaboration |
Key insight: Use SSE when you only need server-to-client push (simpler). Use WebSocket when you need bidirectional communication.
- Detecting disconnects: when a WebSocket connection drops, the `onclose` event fires on the client.
- Reconnecting: retry with exponential backoff (1s → 2s → 4s → 8s → ...) to avoid a thundering herd.
- WebSocket handshake: the client sends a `GET` with an `Upgrade: websocket` header and a random `Sec-WebSocket-Key`; the server replies `101 Switching Protocols` with a `Sec-WebSocket-Accept` computed from the client's key plus a magic GUID.
- SSE reconnection: `EventSource` has built-in auto-reconnection; the client waits the retry interval (configurable via the `retry:` field) and reconnects, sending a `Last-Event-ID` header with the last received `id:` value.
- Thundering herd: when a server goes down, all connected clients (thousands) detect the disconnection simultaneously and try to reconnect at the same time, overwhelming the server.
Solutions:
- Exponential backoff: `delay = baseDelay * 2^attempt`
- Add jitter: `delay = baseDelay * 2^attempt + random(0, 1000)`
- Server-side: respond `503` with a `Retry-After` header to spread reconnects out
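The first two solutions combine into a small delay calculator. A sketch with illustrative defaults:

```javascript
// Reconnect delay: exponential backoff plus random jitter, capped at a
// maximum so delays don't grow without bound. Jitter spreads thousands
// of simultaneous reconnects across a window instead of a single instant.
function reconnectDelay(attempt, baseDelay = 1000, maxDelay = 30000) {
  const backoff = baseDelay * 2 ** attempt; // 1s, 2s, 4s, 8s, ...
  const jitter = Math.random() * 1000;      // random(0, 1000) ms
  return Math.min(backoff + jitter, maxDelay);
}
```

A client would call `reconnectDelay(attempt)` before each retry, incrementing `attempt` until the connection succeeds.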
┌────────┐ SSE/WS ┌────────────┐ Subscribe ┌───────┐
│ Client │◄════════════│ Notif. │◄════════════════│ Redis │
│ Browser │ │ Gateway │ │ Pub/Sub│
└────────┘ └────────────┘ └───────┘
▲
┌────────────┐ Publish │
│ Any Backend │═══════════════════►│
│ Service │
└────────────┘
Key decisions:
| Feature | HTTP/2 Server Push | SSE | WebSocket |
|---|---|---|---|
| Purpose | Push assets (CSS, JS) | Push events/data | Bidirectional messaging |
| Initiated by | Server (with request) | Server (after subscribe) | Either side |
| Use case | Asset preloading | Live data feeds | Chat, games |
| Client control | No (server decides) | Yes (EventSource) | Yes (send/receive) |
| Status | Deprecated in Chrome | Active & supported | Active & supported |
Note: HTTP/2 Server Push was designed for assets, not application data. It has been removed from Chrome (2022) due to low real-world benefit. Don't confuse it with SSE.
| Option | Approach | Trade-off |
|---|---|---|
| Token in URL | `new WebSocket('wss://...?token=jwt')` | Simple but token leaks in logs/history |
| Cookie-based | Cookies sent automatically during handshake | Works with existing session auth |
| Auth after connect | First message = `{ type: 'auth', token }` | Recommended; server closes if invalid |
| Ticket-based | Get one-time ticket via HTTP, connect with ticket | Most secure; short TTL, single use |
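The recommended "auth after connect" option can be sketched as a handler for the first frame on a new socket. Here `verifyToken` and the close code `4401` are placeholders for real JWT verification and whatever close-code convention your server uses:

```javascript
// Treat the FIRST message on a fresh WebSocket as an auth frame.
// Anything other than a valid { type: 'auth', token } closes the socket.
function handleFirstMessage(raw, verifyToken) {
  let msg;
  try {
    msg = JSON.parse(raw);
  } catch {
    return { action: 'close', code: 4401 }; // malformed frame
  }
  if (msg.type !== 'auth') return { action: 'close', code: 4401 };
  const user = verifyToken(msg.token); // e.g. jwt.verify(...) in production
  if (!user) return { action: 'close', code: 4401 };
  return { action: 'accept', user };
}
```

The server wires this into its connection handler: on `accept` it marks the socket authenticated; on `close` it terminates before any other messages are processed.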
Answer:
Answer:
Microservices use a combination of synchronous and asynchronous communication:
Synchronous (request-response):
Asynchronous (event-driven):
Patterns:
Answer:
| Use gRPC when... | Use REST when... |
|---|---|
| Internal service-to-service calls | Public-facing APIs |
| High throughput / low latency needed | Human-readable debugging needed |
| Strong typing and contracts matter | Simplicity is prioritized |
| Streaming (server, client, bidi) needed | Browser clients (without proxy) |
| Polyglot microservices (codegen for any lang) | Wide ecosystem tooling needed |
Key insight: Many architectures use BOTH — REST/GraphQL for external (browser-facing) traffic and gRPC for internal inter-service calls.
Answer:
In microservices, you can't use a single DB transaction across services. The Saga pattern breaks a transaction into a sequence of local transactions, each with a compensating action (rollback).
Example: E-commerce order flow:
If step 3 fails, the saga runs compensations in reverse: refund card → cancel order.
Two approaches:
Answer:
A service mesh (e.g., Istio, Linkerd) adds a sidecar proxy (like Envoy) alongside each microservice to handle networking concerns transparently:
Frontend relevance: When you wonder why your API calls have retries, timeouts, or auth "built in" without explicit code — a service mesh may be handling it at the infrastructure level. Understanding this helps in debugging latency and failure scenarios.
| Question | Answer |
|---|---|
| WebSocket port? | Same as HTTP: 80 (ws://) and 443 (wss://) |
| SSE max connections? | 6 per domain on HTTP/1.1, unlimited on HTTP/2 |
| WebSocket frame overhead? | 2-14 bytes (vs HTTP headers ~200-800 bytes) |
| Can SSE send binary? | No, text only. Use base64 encoding as workaround. |
| Socket.IO = WebSocket? | No. Socket.IO is a library that CAN use WebSocket but also falls back to Long Polling. It adds rooms, acknowledgements, broadcasting. |
| Is WebSocket RESTful? | No. WS is stateful and doesn't follow REST principles. |
| gRPC vs REST? | gRPC: binary (protobuf), typed, streaming. REST: text (JSON), flexible, simpler. |
| WebRTC need server? | Yes, for signaling. Data transfer is P2P after connection. |
| STUN vs TURN? | STUN discovers public IP (lightweight). TURN relays data (expensive fallback). |
| SFU vs MCU? | SFU forwards streams (low CPU). MCU mixes streams into one (high CPU). |
| Kafka vs RabbitMQ? | Kafka: event log with replay. RabbitMQ: traditional message queue. |
| What is a BFF? | Backend for Frontend — a per-client API layer that aggregates microservice data. |
| Circuit breaker? | Stops calling a failing service after N failures. Retries after a timeout. |
Interview Tip: When asked "How would you build X in real-time?", structure your answer as:
- Identify the data flow — unidirectional or bidirectional?
- Pick the protocol — SSE for server push, WS for bidirectional, Short Poll as fallback
- Address scaling — Pub/Sub layer, sticky sessions, horizontal scaling
- Handle failures — Reconnection strategy, message buffering, offline support
- Security — Auth mechanism, rate limiting, origin validation
2026-03-14 20:23:38
## A Survey of LLM-based Deep Search Agents
When I first read these two papers, my immediate thought was how closely they relate to the concepts we learn in our Artificial Intelligence course, especially search algorithms and intelligent agents. In class we usually study algorithms like BFS, DFS, Best-First Search, and A* using small graph examples. At first these problems can feel very academic. However, while reading these papers, I realized that the same ideas are actively being extended and used in modern AI systems, especially when combined with Large Language Models (LLMs).
Both papers approach the idea of intelligent search and planning, but from different angles. One focuses on how LLM-based agents perform deep search, while the other proposes improvements to classical path-planning algorithms using Weighted A* and heuristic rewards.
The goal of this paper is to review and analyze how Large Language Models can act as reasoning agents that perform deep search over possible solutions. Traditional search algorithms explore a state space systematically, but LLM-based agents introduce the ability to reason about the search process itself.
Instead of blindly expanding nodes, these agents can:
This connects strongly to the agent models we study in AI. In our coursework, we learn about:
LLM-based deep search agents resemble goal-based and utility-based agents, because they evaluate possible actions and choose those that move closer to the goal.
For example, when solving complex reasoning tasks, an LLM agent can:
This resembles Best-First Search, but guided by language-based reasoning instead of a purely mathematical heuristic.
The second paper focuses on improving path-planning algorithms, particularly the A* search algorithm.
In standard A* search, the evaluation function combines g(n), the cost of the path from the start to node n, with h(n), the heuristic estimate of the remaining cost from n to the goal. The total score is calculated as:
f(n) = g(n) + h(n)
However, the paper proposes using Weighted A* to prioritize heuristic information more strongly:
f(n) = g(n) + w × h(n)
Here w is a weight that increases the importance of the heuristic estimate.
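A compact sketch makes the difference concrete: the same search loop, with w scaling the heuristic term. With w = 1 this is standard A*; larger w makes the search greedier, trading optimality for speed. The grid representation and unit step costs here are illustrative, not taken from the paper:

```javascript
// Weighted A* on a small grid. Cells with value 1 are walls;
// movement cost is 1 per step; h is the Manhattan distance.
function weightedAStar(grid, start, goal, w = 1) {
  const key = ([r, c]) => `${r},${c}`;
  const h = ([r, c]) => Math.abs(r - goal[0]) + Math.abs(c - goal[1]);
  const open = [{ pos: start, g: 0, f: w * h(start), path: [start] }];
  const closed = new Set();

  while (open.length > 0) {
    open.sort((a, b) => a.f - b.f); // expand the node with the lowest f(n)
    const node = open.shift();
    if (key(node.pos) === key(goal)) return node.path;
    if (closed.has(key(node.pos))) continue;
    closed.add(key(node.pos));

    const [r, c] = node.pos;
    for (const [dr, dc] of [[1, 0], [-1, 0], [0, 1], [0, -1]]) {
      const [nr, nc] = [r + dr, c + dc];
      if (nr < 0 || nc < 0 || nr >= grid.length || nc >= grid[0].length) continue;
      if (grid[nr][nc] === 1 || closed.has(key([nr, nc]))) continue;
      const g = node.g + 1; // g(n): path cost so far
      open.push({ pos: [nr, nc], g, f: g + w * h([nr, nc]), path: [...node.path, [nr, nc]] });
    }
  }
  return null; // no path exists
}
```

On an open 3×3 grid from (0,0) to (2,2), w = 1 returns an optimal 4-step path; raising w expands fewer nodes but may return a longer path in cluttered grids.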
The paper further introduces heuristic rewards, which allow the algorithm to dynamically adjust its guidance based on the environment. Instead of relying only on static heuristics, the system can learn or adapt its evaluation during the search.
This modification is especially useful in environments where:
This concept directly connects to our AI course topics such as:
A practical real-world application of these ideas is autonomous delivery robots used in smart cities or warehouses.
Imagine a robot delivering packages inside a large warehouse.
Using Classical A*
With standard A* search:
However, this approach ignores many real-world factors such as:
With the method described in the second paper:
For example:
| Path | Distance | Obstacle Risk | Heuristic Reward | Result |
|---|---|---|---|---|
| Path A | Short | High | Low | Avoid |
| Path B | Medium | Low | High | Choose |
| Path C | Long | Medium | Medium | Backup |
Even if Path A is shorter, the algorithm may select Path B because it is safer and faster overall.
This leads to:
The most interesting insight is that both papers complement each other.
A future intelligent system could combine both approaches:
This combination could power systems like:
While reading the papers manually, I noticed that both emphasize the importance of hybrid AI systems. Classical algorithms are not replaced by modern AI models; instead, they are enhanced by them.
Combining symbolic search algorithms with neural models is a growing research direction.
NotebookLM also helped summarize complex sections of the papers and made it easier to understand how these algorithms scale to real-world environments.
Reading these papers helped me connect our AI course concepts with real research developments. Algorithms like A* that we practice in programming assignments are still fundamental in modern AI systems.
What has changed is that researchers are now integrating them with large language models and adaptive heuristics to make them more intelligent and flexible.
This shows that learning classical algorithms is still extremely valuable because they form the foundation for advanced AI systems.
2026-03-14 20:21:41
WordPress is better when you need e-commerce, complex plugins, or non-technical content editors. Hugo is better for developers who want speed, security, and simplicity. Most personal blogs and documentation sites should use Hugo. Most business sites and online stores should use WordPress.
WordPress is the world's most popular CMS, powering 43% of all websites. It's a full dynamic CMS with a visual editor, 60,000+ plugins, and thousands of themes. Hugo is a static site generator that compiles Markdown into HTML files — no database, no runtime, no admin UI.
| Feature | WordPress | Hugo |
|---|---|---|
| Type | Dynamic CMS | Static generator |
| Admin UI | Yes (Gutenberg editor) | No |
| Plugin ecosystem | 60,000+ | N/A (themes only) |
| E-commerce | WooCommerce | No |
| Database | MySQL/MariaDB | None |
| Runtime | PHP + Apache/Nginx | None (static HTML) |
| RAM usage | ~200-400 MB | ~10-20 MB (Nginx serving) |
| Build speed | N/A (dynamic) | Milliseconds per page |
| Security surface | High (plugins, PHP, database) | Minimal (static files only) |
| SEO | Via plugins (Yoast, RankMath) | Theme-dependent |
| Multi-author | Yes (user roles) | Via frontmatter |
| Comments | Built-in | External service |
| Forms | Plugin (Contact Form 7, etc.) | External service |
| Search | Built-in + plugins | Client-side (Pagefind) |
| Content format | Database | Markdown files (Git) |
| Update maintenance | Frequent (core + plugins + themes) | None (static output) |
WordPress is straightforward with Docker — two containers (PHP app + MariaDB). The web-based installer handles initial setup. But ongoing maintenance is significant: you'll need to keep WordPress core, themes, and plugins updated. Security patches are frequent.
Hugo has no running service to maintain. Build the site, copy the files to a web server. The Docker setup is a multi-stage build (Hugo builds, Nginx serves). No database to manage, no PHP updates, no plugin vulnerabilities to patch.
Hugo wins on performance by an order of magnitude. Static HTML served by Nginx handles thousands of concurrent users on a $5/month VPS. WordPress requires PHP processing for every page view (though caching plugins like WP Super Cache can mitigate this significantly).
| Metric | WordPress | Hugo |
|---|---|---|
| RAM (idle) | 200-400 MB | 10-20 MB |
| Page load (uncached) | 500-2000ms | 50-100ms |
| Page load (cached) | 100-300ms | 50-100ms |
| Concurrent users (1 CPU VPS) | ~50-100 | ~1000+ |
WordPress has the largest CMS community in the world. Any problem you encounter has been solved before. The plugin ecosystem means you can add nearly any functionality without writing code.
Hugo has a strong developer community but expects command-line proficiency. You won't find a drag-and-drop page builder. The theme ecosystem is smaller but growing.
Hugo wins for developers and technical users. If you can write Markdown and push to Git, Hugo gives you a faster, more secure, cheaper-to-host website with zero maintenance overhead. It's the better choice for personal blogs, documentation, and portfolio sites.
WordPress wins for businesses and non-technical teams. If you need an online store, member areas, contact forms, SEO plugins, and a visual editor that your marketing team can use — WordPress delivers. The maintenance burden is real, but the flexibility is unmatched.
Yes. Use the wordpress-to-hugo-exporter plugin or export WordPress XML and convert it with tools like wp2hugo. Posts convert to Markdown with frontmatter. Images need to be moved manually to Hugo's static/ directory. Plugins, themes, and dynamic features (comments, forms, search) will not transfer — you need Hugo equivalents or third-party services.
Not easily without additional tooling. Hugo has no admin panel — content is Markdown files in a Git repository. You can add a headless CMS like Decap CMS, Tina, or Forestry to provide a web-based editor backed by Git. This adds complexity but makes Hugo accessible to non-developers.
Hugo — by a wide margin. A Hugo site is static HTML. There is no server-side code, no database, no PHP, no login page, no plugin vulnerabilities. The attack surface is essentially zero. WordPress is frequently targeted because it runs PHP, has a public login page (/wp-admin), and its plugin ecosystem includes poorly maintained code. Security plugins (Wordfence, Sucuri) help, but they are mitigations for a fundamentally larger attack surface.
Yes. Hugo has built-in multi-language support with per-language configuration, content directories, and URL routing. WordPress also supports multi-language via plugins (WPML, Polylang), but these are paid plugins that add complexity. Hugo's implementation is simpler and does not require additional plugins or database overhead.
Hugo. A Hugo site is just HTML files — host it for free on Cloudflare Pages, Netlify, GitHub Pages, or Vercel. WordPress requires a VPS with PHP and MySQL ($5-12/month minimum). Over a year, the hosting cost difference is $60-$144 — meaningful for personal projects.
Not natively. Hugo generates static HTML with no server-side logic. You can integrate third-party services like Snipcart, Stripe Checkout, or Shopify's Buy Button for simple product sales. For a full-featured online store with inventory, shipping, and order management, WordPress with WooCommerce is the only realistic self-hosted option.
2026-03-14 20:21:03
Two days ago I shared the architecture flow diagram of my SaaS project LeadIt.
Today I want to share what actually happened after that.
Over the last two days, I started turning that architecture into real working modules. Instead of just diagrams and ideas, LeadIt now has actual APIs, a company analysis engine, and an AI-powered outreach generator.
This post is basically a build log of what I implemented and what I learned while building my first SaaS product.
What is LeadIt?
LeadIt is a project I’m building to help with B2B lead discovery and AI-powered outreach.
The idea is simple:
Instead of manually researching companies and writing cold emails, the system tries to automate most of that process.
1. Setting Up the Company Search API
The first thing I worked on was the database layer.
I connected my backend with Supabase and verified that I could fetch company records successfully.
Once the connection was stable, I built a Company Search API.
This API supports:
- Keyword-based company search
- Pagination
- Returning only the fields the client needs
Pagination was important because once the database grows, returning thousands of records in a single request would slow everything down.
I also made the response lightweight by returning only the required fields instead of full database objects.
After testing the endpoint, it successfully fetched companies like:
Seeing clean API responses was a small but satisfying milestone.
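The pagination-plus-lightweight-fields approach above can be sketched with supabase-js — the table and column names here are illustrative assumptions, not LeadIt's actual schema:

```javascript
// Translate a 1-based page number into the inclusive row range
// that Supabase's range() expects.
function pageToRange(page, pageSize) {
  const from = (page - 1) * pageSize;
  return { from, to: from + pageSize - 1 }; // range() bounds are inclusive
}

// Usage with a Supabase client (requires @supabase/supabase-js):
async function searchCompanies(supabase, { query, page = 1, pageSize = 20 }) {
  const { from, to } = pageToRange(page, pageSize);
  return supabase
    .from('companies')
    .select('id, name, website', { count: 'exact' }) // only required fields
    .ilike('name', `%${query}%`)
    .range(from, to);
}
```

Selecting named columns instead of `*` is what keeps the responses lightweight, and `count: 'exact'` lets the client compute total pages without a second query.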
2. Building the Company Analyze Engine
Once companies could be fetched from the database, the next step was analyzing them.
For this, I built what I call the Company Analyze Engine.
The idea is to automatically scan a company website and detect useful signals.
To achieve this, I used Playwright for automated browsing.
The engine visits key pages of a company website:
To keep the scraper lightweight and fast, I blocked heavy resources like:
This significantly improves scraping speed.
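A sketch of that resource blocking with Playwright's `page.route()` — the exact list of blocked resource types is my assumption about what "heavy resources" means here:

```javascript
// Resource types that add download weight but carry no text signals.
const BLOCKED_RESOURCES = new Set(['image', 'media', 'font', 'stylesheet']);

function shouldBlock(resourceType) {
  return BLOCKED_RESOURCES.has(resourceType);
}

// Wire the filter into a Playwright page (expects a launched Browser).
async function openLightweightPage(browser, url) {
  const page = await browser.newPage();
  await page.route('**/*', (route) =>
    shouldBlock(route.request().resourceType()) ? route.abort() : route.continue()
  );
  await page.goto(url, { waitUntil: 'domcontentloaded' });
  return page;
}
```

Because the handler aborts requests before they download, a text-only scrape often transfers a small fraction of the page's full weight.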
During analysis, the engine looks for business signals, such as:
These signals help determine whether the company might be a good B2B opportunity.
3. Building the Lead Scoring Engine
After detecting signals, I implemented a rule-based lead scoring system.
The goal here is to turn raw signals into a clear opportunity score.
For example:
The system calculates a lead opportunity score and also explains the reason behind the score.
The company endpoint now returns:
This module basically acts as the intelligence layer of LeadIt.
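A minimal sketch of how such a rule-based scorer might look — the signal names, weights, and reasons below are illustrative assumptions, not LeadIt's actual rules:

```javascript
// Each rule maps a detected signal to points and a human-readable reason.
const RULES = [
  { signal: 'hiring', points: 30, reason: 'Company is actively hiring' },
  { signal: 'outdatedTech', points: 25, reason: 'Website uses outdated technology' },
  { signal: 'noAnalytics', points: 15, reason: 'No analytics tooling detected' },
];

// Turn raw boolean signals into a capped score plus its explanation.
function scoreLead(signals) {
  const matched = RULES.filter((r) => signals[r.signal]);
  const score = matched.reduce((sum, r) => sum + r.points, 0);
  return {
    score: Math.min(score, 100),
    reasons: matched.map((r) => r.reason),
  };
}
```

Returning the matched reasons alongside the number is what makes the score explainable rather than a black box.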
4. Building the AI Outreach Generator
Once the system identifies potential opportunities, the next step is outreach.
So I built the AI Outreach Generator.
This module generates personalized cold emails using:
For the AI model, I integrated Groq LLM using the llama-3.1-8b-instant model.
To make the emails more effective, I designed three outreach styles:
Observation Style: Point out something interesting about the company.
Opportunity Style: Suggest a possible improvement or opportunity.
Curiosity Style: Spark curiosity to encourage a reply.
The AI response is then parsed into structured output:
The endpoint now takes company signals and generates context-aware outreach emails automatically.
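The call itself can be sketched against Groq's OpenAI-compatible chat endpoint with the model named in the post. The style instructions, prompt wording, and `GROQ_API_KEY` variable are my assumptions:

```javascript
const STYLES = {
  observation: 'Open by pointing out something specific you noticed about the company.',
  opportunity: 'Suggest one concrete improvement or opportunity for the company.',
  curiosity: 'Ask a short question that sparks curiosity and invites a reply.',
};

// Combine company signals and the chosen style into a single prompt.
function buildPrompt(company, style) {
  return (
    `Write a short cold email to ${company.name}. ` +
    `${STYLES[style]} Signals: ${company.signals.join(', ')}.`
  );
}

// Groq exposes an OpenAI-compatible REST API (Node 18+ global fetch).
async function generateOutreach(company, style) {
  const res = await fetch('https://api.groq.com/openai/v1/chat/completions', {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      Authorization: `Bearer ${process.env.GROQ_API_KEY}`,
    },
    body: JSON.stringify({
      model: 'llama-3.1-8b-instant',
      messages: [{ role: 'user', content: buildPrompt(company, style) }],
    }),
  });
  const data = await res.json();
  return data.choices[0].message.content;
}
```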
5. Production Considerations
Even though this is an early version, I tried to keep production stability in mind.
Some safeguards I added include:
These help prevent abuse and keep the system stable.
6. Debugging Moment: Tailwind CSS Version Conflict
Not everything went smoothly.
While setting up Next.js 14, I kept getting repeated build errors related to Tailwind CSS.
After debugging configs and dependencies, I realized the issue:
I had installed Tailwind CSS v4, which uses a new PostCSS plugin and configuration format.
My project runs on Next.js 14 with a v3-style Tailwind setup, so PostCSS and CSS compilation kept failing.
The fix was simply downgrading to Tailwind CSS v3.
Once I did that, the build errors disappeared.
A small mistake, but a good reminder about framework compatibility.
Where LeadIt Stands Now
After the last two days of development, LeadIt can now:
- Search and paginate companies from the database
- Analyze company websites and detect business signals
- Score leads with a rule-based engine and explain the score
- Generate AI-powered, style-based outreach emails
This is starting to look like the foundation of an automated B2B lead generation platform.
Still early days, but it’s exciting to see the architecture slowly turn into a real product.
Final Thoughts
Building your first SaaS product is chaotic.
You spend hours debugging small things.
You question your architecture decisions.
You rewrite code multiple times.
But the moment your system actually starts working — APIs responding, AI generating emails, data flowing — it feels incredible.
LeadIt is still early, but the core engine is finally starting to work.
And that feels like real progress.
If you're also building a SaaS or experimenting with AI tools, I'd love to hear what you're working on.
2026-03-14 20:17:40
If you have been writing JavaScript for a few years, you are probably intimately familiar with async/await. It makes asynchronous code look synchronous, keeps the event loop unblocked, and is universally loved.
But async/await hides a massive system design bottleneck: Buffering.
If you are building data-intensive applications — proxying AWS S3 uploads, generating massive CSVs from MongoDB, or rendering complex Next.js Server Components — relying solely on buffered data will inevitably lead to Out of Memory (OOM) crashes and abysmal Time to First Byte (TTFB).
Let's break down the real-world problems buffering causes, understand the mental model of Streams, and look at how to properly architect streaming solutions across Node.js, Next.js, and databases.
When you use standard await to read a file or fetch an API, the underlying engine loads 100% of that data into RAM before your code executes the next line.
Imagine an Express route that downloads a 2GB video file for a user:
// 🚨 THE WRONG WAY: Buffering
import fs from 'node:fs';

app.get('/download', async (req, res) => {
  // Node reads the ENTIRE 2GB file into RAM right now.
  const videoData = await fs.promises.readFile('./massive-video.mp4');
  res.send(videoData);
});
If your AWS EC2 instance has 1GB of RAM, this code instantly crashes your Node process with FATAL ERROR: JavaScript heap out of memory. If you have 8GB of RAM, it only takes four concurrent users to kill your server.
If your React frontend fetches a massive JSON payload, the browser downloads the entire payload, holds it in memory, and waits for the connection to close before parsing it. The user stares at a blank screen or a spinner until the very last byte arrives.
Streams fix this by breaking data into manageable "chunks." You process chunk 1 while chunk 2 is still downloading — keeping your server's memory footprint flat and delivering data to the frontend immediately.
The most confusing part of modern JavaScript architecture is that there are two entirely different Stream APIs. With the rise of the Next.js App Router and Edge computing, these two worlds are crashing into each other.
Node.js Streams (node:stream) — The backend heavyweights. These belong strictly to the Node runtime. They use .pipe() and .pipeline(). You use these for filesystem operations, heavy local data processing, and traditional Express req/res objects.
The Web Streams API (ReadableStream) — The modern standard. Originally built for the browser (like the fetch API response body), but now natively used by Next.js Edge Functions, Route Handlers, and Cloudflare Workers. They use .pipeTo() and .pipeThrough().
⚠️ Mixing these up is the number one cause of bugs when migrating an Express app to a Next.js App Router API.
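The two APIs even compose differently: Node streams chain with `.pipe()`/`pipeline()`, while Web Streams chain with `.pipeThrough()`/`.pipeTo()`. A minimal Web Streams pipeline — an illustrative sketch that runs natively in Node 18+ or any modern browser:

```javascript
// A TransformStream that uppercases each chunk in flight.
const upper = new TransformStream({
  transform(chunk, controller) {
    controller.enqueue(chunk.toUpperCase());
  },
});

// A ReadableStream that emits three string chunks, then closes.
const source = new ReadableStream({
  start(controller) {
    ['str', 'eam', 'ing'].forEach((c) => controller.enqueue(c));
    controller.close();
  },
});

// Drain any Web ReadableStream of strings into a single value.
async function collect(stream) {
  const reader = stream.getReader();
  let out = '';
  for (;;) {
    const { done, value } = await reader.read();
    if (done) return out;
    out += value;
  }
}

// usage: await collect(source.pipeThrough(upper)) // → 'STREAMING'
```

Note that a Web stream is single-use: once a reader has drained it, it cannot be read again.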
Let's look at how Streams are actually used in real environments, evaluating the constraints and correct implementations.
The Problem: You need to export 1 million user records to a CSV file.
The Constraint: Running await User.find({}) will load 1 million objects into your Node server's RAM, crashing it instantly.
The Solution (Node Streams):
Both MongoDB and PostgreSQL offer native stream cursors. Instead of fetching an array, you stream documents one by one, format them, and pipe them directly to the client.
// Express.js + MongoDB Example
import { pipeline } from 'node:stream/promises';
import { Transform } from 'node:stream';
app.get('/export-users', async (req, res) => {
res.setHeader('Content-Type', 'text/csv');
res.setHeader('Content-Disposition', 'attachment; filename="users.csv"');
// 1. Create a database read stream
const cursorStream = User.find().cursor();
// 2. Transform each JSON document into a CSV row on the fly
const toCsvTransform = new Transform({
objectMode: true,
transform(doc, encoding, callback) {
const csvLine = `${doc.name},${doc.email}\n`;
callback(null, csvLine);
}
});
// 3. pipeline() handles backpressure and cleans up memory if the user disconnects
try {
await pipeline(cursorStream, toCsvTransform, res);
} catch (err) {
console.error('Stream pipeline failed', err);
}
});
The Problem: LLMs take several seconds to generate a response. Waiting for the full string ruins the UX.
The Solution (Web Streams): Read the ReadableStream from the fetch API chunk-by-chunk to create a real-time "typing" effect.
// React Client Component
const handleAskAI = async () => {
const response = await fetch('/api/chat');
// Get the Web Stream reader
const reader = response.body.getReader();
const decoder = new TextDecoder();
while (true) {
const { done, value } = await reader.read();
if (done) break;
// Decode the raw bytes into text and update React state immediately
const chunkText = decoder.decode(value, { stream: true }); // stream: true handles multi-byte chars split across chunks
setChatText((prev) => prev + chunkText);
}
};
The Problem: You want to stream a large file from your server disk to the user, but Next.js Route Handlers expect a Web ReadableStream as the response — not a Node fs.ReadStream.
The Solution: Convert the Node Stream to a Web Stream. Modern Node (v17+) ships a built-in utility for this.
// src/app/api/download/route.ts
import { createReadStream } from 'node:fs';
import { Readable } from 'node:stream';
export async function GET() {
const nodeStream = createReadStream('./heavy-asset.zip');
// Convert Node.js Stream → Web Streams API ReadableStream
const webStream = Readable.toWeb(nodeStream);
return new Response(webStream, {
headers: {
'Content-Type': 'application/zip',
'Content-Disposition': 'attachment; filename="asset.zip"',
},
});
}
Streams add massive architectural complexity. Error handling is notoriously difficult because a stream can fail halfway through — for example, the user loses internet connection while downloading.
If you are fetching a standard JSON list of 50 items or doing simple CRUD operations, do not use streams. Stick to standard async/await and buffering.
✅ Use streams strictly as an accelerator when you hit the physical limits of your server's RAM, or when network latency demands immediate partial rendering.
Streams are not atomic. A 1GB download might fail at 500MB because the user closed their laptop, the browser tab crashed, or the Wi-Fi dropped.
If you do not handle this properly, the source stream (like a database cursor or a file read stream) will stay open forever, waiting to send the rest of the data. This creates a silent memory leak that will eventually take down your Node.js instance.
Here is how to handle halfway failures across different parts of the stack.
The Danger of Bare .pipe()
In older Express tutorials, you will constantly see this:
// 🚨 DANGEROUS: If 'res' closes early, 'fileStream' stays open forever.
const fileStream = fs.createReadStream('./massive.mp4');
fileStream.pipe(res);
The Fix: Always use stream.pipeline (specifically the Promise-based version). If the user disconnects or the network fails, pipeline automatically sends a destroy signal to every stream in the chain, safely freeing up your server's RAM.
import { pipeline } from 'node:stream/promises';
import fs from 'node:fs';
app.get('/download', async (req, res) => {
const fileStream = fs.createReadStream('./massive.mp4');
try {
// ✅ SAFE: pipeline monitors the connection.
// If the user drops, it destroys fileStream automatically.
await pipeline(fileStream, res);
} catch (err) {
if (err.code === 'ERR_STREAM_PREMATURE_CLOSE') {
console.warn('User canceled the download halfway through.');
} else {
console.error('Pipeline failed:', err);
}
}
});
Next.js Route Handlers and AbortSignal
In Next.js 14+ Route Handlers, the user's browser connection is tied to the standard Web Request object.
If a user navigates away while your API is streaming a heavy database query, you need to listen to req.signal — an AbortSignal that fires the moment the client drops.
// src/app/api/heavy-export/route.ts
export async function GET(req: Request) {
const encoder = new TextEncoder();
const stream = new ReadableStream({
async start(controller) {
try {
for (let i = 0; i < 10000; i++) {
// 🛑 CRITICAL CHECK: Did the user close the browser tab?
if (req.signal.aborted) {
console.log('Client disconnected. Halting DB query.');
break;
}
// Simulate heavy DB fetch
const data = await fetchNextDatabaseRow(i);
controller.enqueue(encoder.encode(data + '\n'));
}
controller.close();
} catch (error) {
controller.error(error);
}
}
});
return new Response(stream);
}
If you are consuming a stream in the browser (like the AI "typing" effect) and the user clicks a "Stop Generating" button or navigates away, you must actively cancel the reader.
If you unmount the component without canceling, the browser will keep downloading chunks in the background — wasting the user's bandwidth.
'use client';
import { useEffect, useRef } from 'react';
export function AIChat() {
const readerRef = useRef(null);
const startStream = async () => {
const res = await fetch('/api/ai');
readerRef.current = res.body.getReader();
// ... loop through reader.read() ...
};
// 🧹 Cleanup: runs when the component unmounts
useEffect(() => {
return () => {
if (readerRef.current) {
// Instantly kills the active download stream
readerRef.current.cancel('User navigated away');
}
};
}, []);
return <button onClick={startStream}>Generate</button>;
}
| Scenario | API to Use | Key Tool |
|---|---|---|
| Express file/DB export | Node.js Streams | pipeline() from node:stream/promises |
| Next.js Route Handler file | Bridge both worlds | Readable.toWeb() |
| AI streaming in browser | Web Streams API | response.body.getReader() |
| Client disconnect (Next.js) | Web Streams API | req.signal.aborted |
| Component unmount cleanup | Web Streams API | reader.cancel() |
Streams are one of those fundamentals that separate developers who can architect resilient, production-grade systems from those who write code that works fine in dev — and dies the moment it sees real traffic.
2026-03-14 20:17:12
Security is one of the most critical aspects when designing cloud infrastructure. In Oracle Cloud Infrastructure, Identity and Access Management (IAM) provides a centralized framework to control access to resources and services.
IAM allows administrators to define who can access cloud resources and what actions they are allowed to perform, ensuring a secure and well-managed cloud environment.
In this article, we will explore the core IAM architecture and understand how its components work together.
Why IAM is Important
In a cloud environment, multiple users, applications, and services interact with infrastructure resources. Without proper access control, organizations risk exposing sensitive data or critical infrastructure.
OCI IAM helps organizations:
- Control which users, groups, and services can access each resource
- Define exactly what actions are permitted on those resources
- Isolate teams and environments behind clear access boundaries
- Avoid storing long-lived credentials on servers
Core Components of OCI IAM
OCI IAM is built using several key components.
Compartments
Compartments are logical containers used to organize and isolate OCI resources.
They allow administrators to structure cloud environments and apply access control boundaries.
Example compartment hierarchy:
Root Tenancy
│
├── Development
│ ├── Compute
│ └── Storage
│
└── Production
├── Application Servers
└── Databases
This structure helps maintain clear separation between environments.
Users and Groups
Users represent identities that can access the OCI Console or APIs.
Groups are collections of users with similar responsibilities.
Instead of assigning permissions to individual users, administrators assign policies to groups.
Example:
Group: DevOps
Users:
This simplifies permission management across teams.
IAM Policies
Policies define what actions users or groups are allowed to perform on OCI resources.
Example policy:
Allow group DevOps to manage instance-family in compartment Production
Policies usually define:
- Who is granted access (a group or dynamic group)
- The action level (inspect, read, use, or manage)
- The resource type (for example, instance-family or buckets)
- The scope (a specific compartment or the entire tenancy)
Policies form the core of OCI authorization.
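OCI's policy verbs form a strict hierarchy — inspect < read < use < manage — and every statement follows the same `Allow <subject> to <verb> <resource-type> in <location>` shape. A few illustrative statements (the group names are hypothetical):

```text
Allow group Auditors   to inspect all-resources in tenancy
Allow group Developers to use instances in compartment Development
Allow group DBAdmins   to manage databases in compartment Production
```

Starting at inspect or read and widening only when a team demonstrably needs more keeps policies aligned with least privilege.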
Dynamic Groups and Instance Principals
Modern cloud applications often run on compute instances and need access to OCI services.
Instead of storing API credentials on servers, OCI provides Instance Principals.
Instance principals allow compute instances to authenticate with OCI services using instance identity.
Example access flow:
Compute Instance
│
▼
Instance Principal
│
▼
Dynamic Group
│
▼
IAM Policy
│
▼
OCI Service Access
Dynamic groups automatically include instances based on matching rules.
Example dynamic group rule:
ALL {instance.compartment.id = ''}
Example policy:
Allow dynamic-group app-instances to read buckets in compartment Storage
This architecture eliminates the need to store credentials on servers.
Real-World Example
Imagine an application running on an OCI compute instance that needs to upload files to Object Storage. Instead of storing API keys on the server, the instance is matched into a dynamic group, a policy grants that dynamic group access to the bucket's compartment, and the application authenticates through its instance principal.
This enables secure and automated access to OCI services.
Best Practices for OCI IAM
When designing IAM architecture in OCI, follow these best practices:
- Grant least privilege — start with inspect/read and widen only when needed
- Attach policies to groups, not to individual users
- Use compartments to separate environments (for example, Development vs Production)
- Prefer dynamic groups and instance principals over credentials stored on servers
- Review and audit policies regularly
Conclusion
Identity and Access Management is a foundational security service in Oracle Cloud Infrastructure. By combining compartments, users, groups, policies, and dynamic groups, organizations can build a secure access control framework for their cloud environments.
Understanding IAM architecture is essential for designing secure and scalable OCI workloads.
GitHub Repository
You can explore the complete IAM implementation and architecture documentation here:
OCI IAM deep dive covering users, groups, policies, dynamic groups, instance principals and advanced access patterns in Oracle Cloud Infrastructure.
Identity and Access Management (IAM) is the security foundation of Oracle Cloud Infrastructure (OCI). It controls authentication and authorization for users, services, and applications interacting with cloud resources.
OCI IAM allows administrators to define who can access resources and what actions they can perform through policies, groups, and dynamic access mechanisms.
This repository provides an in-depth explanation of OCI IAM components and advanced access patterns used in enterprise cloud environments.
OCI IAM consists of several key components:
- Users and Groups
- Compartments
- Policies
- Dynamic Groups and Instance Principals
These components work together to implement secure access control across OCI services.
Typical access flow:
User
 │
 ▼
OCI IAM
 │
 ▼
Group Membership
 │
 ▼
Policy Evaluation
 │
 ▼
Access to OCI Resource