2025-05-20 23:31:05
Bugs sneak out when less than 80% of user flows are tested before shipping. However, getting that kind of coverage (and staying there) is hard and pricey for any team.
QA Wolf’s AI-native service provides high-volume, high-speed test coverage for web and mobile apps, reducing your organization’s QA cycle to less than 15 minutes.
They can get you:
24-hour maintenance and on-demand test creation
Zero flakes, guaranteed
Engineering teams move faster, releases stay on track, and testing happens automatically—so developers can focus on building, not debugging.
Drata’s team of 80+ engineers achieved 4x more test cases and 86% faster QA cycles.
⭐ Rated 4.8/5 on G2
Disclaimer: The details in this post have been derived from the articles/videos shared online by the Facebook/Meta engineering team. All credit for the technical details goes to the Facebook/Meta Engineering Team. The links to the original articles and videos are present in the references section at the end of the post. We’ve attempted to analyze the details and provide our input about them. If you find any inaccuracies or omissions, please leave a comment, and we will do our best to fix them.
Facebook didn’t set out to dominate live video overnight. The platform’s live streaming capability began as a hackathon project with the modest goal of seeing how fast they could push video through a prototype backend. It gave the team a way to measure end-to-end latency under real conditions. That test shaped everything that followed.
Facebook Live moved fast by necessity. From that rooftop prototype, it took just four months to launch an MVP through the Mentions app, aimed at public figures like Dwayne Johnson. Within eight months, the platform rolled out to the entire user base of well over a billion users.
The video infrastructure team at Facebook owns the end-to-end path of every video. That includes uploads from mobile phones, distributed encoding in data centers, and real-time playback across the globe. They build for scale by default, not because it sounds good in a deck, but because scale is a constraint. When 1.2 billion users might press play, bad architecture shows up immediately as buffering, dropped streams, and outages.
The infrastructure needed to make that happen relied on foundational principles: composable systems, predictable patterns, and sharp handling of chaos. Every stream, whether it came from a celebrity or a teenager’s backyard, needed the same guarantees: low latency, high availability, and smooth playback. And every bug, every outage, every unexpected spike forced the team to build smarter, not bigger.
In this article, we’ll look at how Facebook Live was built and the kind of challenges they faced.
Engineering hiring is booming again: U.S. companies with revenue of $50 million+ are anticipating a 12% hiring increase compared with 2024.
Employers and candidates are wondering: how do remote software engineer salaries compare across global markets?
Terminal’s Remote Software Engineer Salary Report includes data from 260K+ candidates across Latin America, Canada and Europe. Employers can better inform hiring decisions and candidates can understand their earning potential.
Our hiring expertise runs deep: Terminal is the smarter platform for hiring remote engineers. We help you hire elite engineering talent up to 60% cheaper than U.S. talent.
At the heart of Facebook’s video strategy lies a sprawling infrastructure. Each component serves a specific role in making sure video content flows smoothly from creators to viewers, no matter where they are or what device they’re using.
The diagram below shows a high-level view of this infrastructure:
The upload pipeline is where the video journey begins.
It handles everything from a celebrity’s studio-grade stream to a shaky phone video in a moving car. Uploads must be fast, but more importantly, they must be resilient. Network drops, flaky connections, or device quirks shouldn’t stall the system.
Uploads are chunked to support resumability and reduce retry cost (a chunked-upload sketch follows this list).
Redundant paths and retries protect against partial failures.
Metadata extraction starts during upload, enabling early classification and processing.
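To make the resumable upload idea concrete, here is a minimal Python sketch of a chunked upload client. The endpoint, session ID, chunk size, and retry budget are illustrative assumptions, not Facebook’s actual upload API.

```python
# Minimal sketch of a chunked, resumable upload client. Endpoint, session id,
# chunk size, and retry budget are illustrative, not Facebook's actual API.
import requests

CHUNK_SIZE = 4 * 1024 * 1024                      # 4 MB per chunk (illustrative)
UPLOAD_URL = "https://example.com/video/upload"   # hypothetical endpoint


def upload_video(path: str, session_id: str, start_chunk: int = 0) -> None:
    """Upload `path` in chunks, resuming from `start_chunk` after a failure."""
    with open(path, "rb") as f:
        f.seek(start_chunk * CHUNK_SIZE)
        index = start_chunk
        while True:
            chunk = f.read(CHUNK_SIZE)
            if not chunk:
                return                            # all chunks acknowledged
            for attempt in range(3):              # bounded retries per chunk
                try:
                    resp = requests.post(
                        UPLOAD_URL,
                        params={"session": session_id, "chunk": index},
                        data=chunk,
                        timeout=10,
                    )
                    if resp.ok:
                        break                     # chunk acknowledged; move on
                except requests.RequestException:
                    pass                          # flaky network: retry this chunk
            else:
                # Give up for now; the caller can resume later from `index`
                # without re-sending chunks that already succeeded.
                raise RuntimeError(f"upload stalled at chunk {index}")
            index += 1
```

The key property is that progress is tracked per chunk, so a dropped connection only costs the chunk in flight rather than the whole file.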
Beyond reliability, the system clusters similar videos. This feeds recommendation engines that suggest related content to the users. The grouping happens based on visual and audio similarity, not just titles or tags. That helps surface videos that feel naturally connected, even if their metadata disagrees.
Encoding is a computationally heavy bottleneck if done naively. Facebook splits incoming videos into chunks, encodes them in parallel, and stitches them back together.
This massively reduces latency and allows the system to scale horizontally. Some features are as follows (the chunk-encode-stitch flow is sketched after this list):
Each chunk is independently transcoded across a fleet of servers.
Bitrate ladders are generated dynamically to support adaptive playback.
Reassembly happens quickly without degrading quality or breaking audio-video sync.
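As a rough illustration of the chunk-encode-stitch flow, the sketch below fans encoding work out across processes with Python’s concurrent.futures. The split, encode, and stitch helpers are placeholders for real media tooling (for example, ffmpeg invocations), not Facebook’s encoder.

```python
# Rough sketch of split -> parallel transcode -> stitch. The helpers are
# placeholders for real media tooling, not Facebook's encoding pipeline.
from concurrent.futures import ProcessPoolExecutor


def split_into_chunks(video_path: str) -> list[str]:
    """Cut the source into fixed-length chunk files (placeholder)."""
    return [f"{video_path}.chunk{i}" for i in range(8)]


def encode_chunk(chunk_path: str, bitrate_kbps: int) -> str:
    """Transcode one chunk at one bitrate (placeholder; real systems shell out to an encoder)."""
    return f"{chunk_path}.{bitrate_kbps}kbps"


def stitch(encoded_paths: list[str]) -> str:
    """Concatenate encoded chunks back into a single rendition (placeholder)."""
    return "+".join(encoded_paths)


def encode_video(video_path: str, bitrate_ladder: list[int]) -> dict[int, str]:
    chunks = split_into_chunks(video_path)
    renditions = {}
    with ProcessPoolExecutor() as pool:
        for bitrate in bitrate_ladder:
            # Each chunk is transcoded independently, so the work spreads
            # across cores (or, at Facebook's scale, across a server fleet).
            encoded = list(pool.map(encode_chunk, chunks, [bitrate] * len(chunks)))
            renditions[bitrate] = stitch(encoded)
    return renditions


if __name__ == "__main__":
    print(encode_video("cat_video.mp4", bitrate_ladder=[400, 1200, 2500]))
```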
This platform prepares content for consumption across every device class and network condition. Mobile users in rural zones, desktop viewers on fiber, everyone gets a version that fits their bandwidth and screen.
Live streams add a layer of complexity. Unlike uploaded videos, live content arrives raw, gets processed on the fly, and must reach viewers with minimal delay. The architecture must absorb the chaos of real-time creation while keeping delivery tight and stable.
Broadcast clients (phones, encoders) connect via secure RTMP to entry points called POPs (Points of Presence).
Streams get routed through data centers, transcoded in real time, and dispatched globally.
Viewers watch through mobile apps, desktop browsers, or APIs.
This is like a two-way street. Comments, reactions, and viewer engagement flow back to the broadcaster, making live content deeply interactive. Building that loop demands real-time coordination across networks, services, and user devices.
Scaling Facebook Live is about building for a reality where “peak traffic” is the norm. With over 1.23 billion people logging in daily, the infrastructure must assume high load as the baseline, not the exception.
Some scaling requirements were as follows:
This wasn’t a typical SaaS model growing linearly. When a product like Facebook Live goes global, it lands in every timezone, device, and network condition simultaneously.
The system must perform across the globe in varying conditions, from rural to urban. And every day, it gets pushed by new users, new behaviors, and new demands. Almost 1.23 billion daily active users formed the base load, and traffic patterns followed cultural, regional, and global events.
To keep latency low and reliability high, Facebook uses a combination of Points of Presence (POPs) and Data Centers (DCs).
POPs act as the first line of connection, handling ingestion and local caching. They sit closer to users and reduce the hop count.
DCs handle the heavy lifting: encoding, storing, and dispatching live streams to other POPs and clients.
This architecture allows for regional isolation and graceful degradation. If one POP goes down, others can pick up the slack without a central failure.
Here are some key scaling challenges Facebook faced:
Concurrent Stream Ingestion: Handling thousands of concurrent broadcasters at once is not trivial. Ingesting and encoding live streams requires real-time CPU allocation, predictable bandwidth, and a flexible routing system that avoids bottlenecks.
Unpredictable Viewer Surges: Streams rarely follow a uniform pattern. One moment, a stream has minimal viewers. Next, it's viral with 12 million. Predicting this spike is nearly impossible, and that unpredictability wrecks static provisioning strategies. Bandwidth consumption doesn’t scale linearly. Load balancers, caches, and encoders must adapt in seconds, not minutes.
Hot Streams and Viral Behavior: Some streams, such as political events or breaking news, can go global without warning. These events impact the caching and delivery layers. One stream might suddenly account for 50% of all viewer traffic. The system must replicate stream segments rapidly across POPs and dynamically allocate cache layers based on viewer geography.
Streaming video live is about managing flow across an unpredictable, global network. Every live session kicks off a chain reaction across infrastructure components built to handle speed, scale, and chaos. Facebook Live’s architecture reflects this need for real-time resilience.
Live streams originate from a broad set of sources:
Phones with shaky LTE
Desktops with high-definition cameras
Professional setups using the Live API and hardware encoders
These clients create RTMPS (Real-Time Messaging Protocol Secure) streams. RTMPS carries the video payload with low latency and encryption, making it viable for casual streamers and production-level events.
POPs act as the first entry point into Facebook’s video pipeline. They’re regional clusters of servers optimized for:
Terminating RTMPS connections close to the source
Minimizing round-trip latency for the broadcaster
Forwarding streams securely to the appropriate data center
Each POP is tuned to handle a high volume of simultaneous connections and quickly routes streams using consistent hashing to distribute load evenly.
See the diagram below:
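As a rough sketch of the routing idea, the snippet below builds a small consistent-hash ring and maps stream IDs onto encoding hosts. The host names, virtual-node count, and hash function are illustrative assumptions.

```python
# Minimal consistent-hashing sketch of how a POP could map stream IDs onto
# encoding hosts. Host names and ring parameters are illustrative.
import bisect
import hashlib


class ConsistentHashRing:
    def __init__(self, hosts, vnodes=100):
        self.ring = []                           # sorted list of (hash, host)
        for host in hosts:
            for i in range(vnodes):              # virtual nodes smooth the load
                self.ring.append((self._hash(f"{host}#{i}"), host))
        self.ring.sort()
        self.keys = [h for h, _ in self.ring]

    @staticmethod
    def _hash(value: str) -> int:
        return int(hashlib.md5(value.encode()).hexdigest(), 16)

    def host_for(self, stream_id: str) -> str:
        # Walk clockwise to the first host at or after the stream's hash.
        idx = bisect.bisect(self.keys, self._hash(stream_id)) % len(self.ring)
        return self.ring[idx][1]


ring = ConsistentHashRing(["encoder-01", "encoder-02", "encoder-03"])
print(ring.host_for("live-stream-8675309"))
# Adding or removing a host only remaps the streams adjacent to it on the ring,
# rather than reshuffling every stream.
```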
Once a POP forwards a stream, the heavy lifting happens in a Facebook data center. This is where the encoding hosts:
Authenticate incoming streams using stream tokens
Claim ownership of each stream to ensure a single source of truth
Transcode video into multiple bitrates and resolutions
Generate playback formats like DASH and HLS
Archive the stream for replay or on-demand viewing
Each data center operates like a mini CDN node, tailored to Facebook’s specific needs and traffic patterns.
Live video puts pressure on distribution in ways that on-demand video doesn’t.
With pre-recorded content, everything is cacheable ahead of time. But in a live stream, the content is being created while it's being consumed. That shifts the burden from storage to coordination. Facebook’s answer was to design a caching strategy that can support this.
The architecture uses a two-tier caching model:
POPs (Points of Presence): Act as local cache layers near users. They hold recently fetched stream segments and manifest files, keeping viewers out of the data center as much as possible.
DCs (Data Centers): Act as origin caches. If a POP misses, it falls back to a DC to retrieve the segment or manifest. This keeps encoding hosts from being overwhelmed by repeated requests.
This separation allows independent scaling and regional flexibility. As more viewers connect from a region, the corresponding POP scales up, caching hot content locally and shielding central systems.
The first time a stream goes viral, hundreds or thousands of clients might request the same manifest or segment at once. If all those hit the data center directly, the system gets into trouble.
To prevent that, Facebook uses cache-blocking timeouts (sketched in code after the list below):
When a POP doesn’t have the requested content, it sends a fetch upstream.
All other requests for that content are held back.
If the first request succeeds, the result populates the cache, and everyone gets a hit.
If it times out, everyone floods the DC, causing a thundering herd.
The balance is tricky:
If the timeout is too short, the herd gets unleashed too often.
If the timeout is too long, viewers start experiencing lag or jitter.
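Here is a simplified, single-process sketch of that coalescing logic. The class name, locking strategy, and timeout value are illustrative; the real mechanism lives inside Facebook’s edge cache servers rather than application code.

```python
# Simplified, single-process sketch of cache-blocking timeouts (request
# coalescing). Names and the timeout value are illustrative.
import threading


class CoalescingCache:
    def __init__(self, fetch_from_origin, block_timeout=0.5):
        self.fetch_from_origin = fetch_from_origin  # e.g. HTTP call to the DC
        self.block_timeout = block_timeout          # seconds followers wait
        self.cache = {}
        self.inflight = {}                          # key -> Event for the leader fetch
        self.lock = threading.Lock()

    def get(self, key):
        with self.lock:
            if key in self.cache:
                return self.cache[key]              # cache hit: served at the POP
            event = self.inflight.get(key)
            if event is None:
                # First miss becomes the "leader" and fetches upstream.
                event = threading.Event()
                self.inflight[key] = event
                leader = True
            else:
                leader = False

        if leader:
            try:
                value = self.fetch_from_origin(key)
                with self.lock:
                    self.cache[key] = value
                return value
            finally:
                event.set()
                with self.lock:
                    self.inflight.pop(key, None)

        # Followers block until the leader fills the cache or the timeout fires.
        if event.wait(self.block_timeout):
            with self.lock:
                if key in self.cache:
                    return self.cache[key]
        # Timeout (or leader failure): fall through to the origin ourselves.
        # Too short a timeout and this branch becomes the thundering herd.
        return self.fetch_from_origin(key)
```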
Live streams rely on manifests: a table of contents that lists available segments. Keeping these up-to-date is crucial for smooth playback.
Facebook uses two techniques:
TTL (Time to Live): Each manifest has a short expiry window, usually a few seconds. Clients re-fetch the manifest when it expires.
HTTP Push: A more advanced option, where updates get pushed to POPs in near real-time. This reduces stale reads and speeds up segment availability.
HTTP Push is preferable when tight latency matters, especially for streams with high interaction or fast-paced content. TTL is simpler but comes with trade-offs in freshness and efficiency.
Live playback is about consistency, speed, and adaptability across networks that don’t care about user experience.
Facebook’s live playback pipeline turns a firehose of real-time video into a sequence of reliable HTTP requests, and DASH is the backbone that makes that work.
DASH breaks live video into two components:
A manifest file that acts like a table of contents.
A sequence of media files, each representing a short segment of video (usually 1 second).
The manifest evolves as the stream continues. New entries are appended, old ones fall off, and clients keep polling to see what’s next. This creates a rolling window, typically a few minutes long, that defines what’s currently watchable.
Clients issue HTTP GET requests for the manifest.
When new entries appear, they fetch the corresponding segments.
Segment quality is chosen based on available bandwidth, avoiding buffering or quality drops.
This model works because it’s simple, stateless, and cache-friendly. And when done right, it delivers video with sub-second delay and high reliability.
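To ground this, here is a heavily simplified sketch of a DASH-style live playback loop: poll the manifest, fetch new segments, and adapt bitrate to measured bandwidth. The manifest format, URLs, and bandwidth heuristic are assumptions for illustration, not Facebook’s player.

```python
# Sketch of a DASH-style live playback loop. Manifest layout, URLs, and the
# bandwidth heuristic are simplified stand-ins.
import time
import requests

MANIFEST_URL = "https://pop.example.com/live/stream123/manifest.json"  # hypothetical


def pick_bitrate(measured_kbps: float, ladder=(400, 1200, 2500, 4500)) -> int:
    """Choose the highest rendition comfortably below the measured bandwidth."""
    usable = [b for b in ladder if b * 1.3 < measured_kbps]
    return max(usable) if usable else min(ladder)


def play():
    seen = set()
    bandwidth_kbps = 1500.0  # start with a conservative estimate
    while True:
        manifest = requests.get(MANIFEST_URL, timeout=2).json()
        for segment in manifest["segments"]:           # rolling window of segments
            if segment["id"] in seen:
                continue
            bitrate = pick_bitrate(bandwidth_kbps)
            url = segment["urls"][str(bitrate)]         # per-rendition segment URL
            start = time.monotonic()
            data = requests.get(url, timeout=5).content
            elapsed = max(time.monotonic() - start, 1e-3)
            bandwidth_kbps = (len(data) * 8 / 1000) / elapsed  # update the estimate
            seen.add(segment["id"])
            # hand `data` to the decoder/renderer here
        time.sleep(manifest.get("refresh_after_s", 1))  # TTL-style manifest re-fetch
```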
Playback clients don’t talk to data centers directly. Instead, they go through POPs: edge servers deployed around the world.
POPs serve cached manifests and segments to minimize latency.
If a client requests something new, the POP fetches it from the nearest data center, caches it, and then returns it.
Repeat requests from nearby users hit the POP cache instead of hammering the DC.
This two-tier caching model (POPs and DCs) keeps things fast and scalable:
It reduces the load on encoding hosts, which are expensive to scale.
It localizes traffic, meaning regional outages or spikes don’t propagate upstream.
It handles unpredictable viral traffic with grace, not panic.
Facebook Live didn’t reach a billion users by accident. It got there through deliberate, pragmatic engineering. The architecture was designed to survive chaos in production.
The story begins with a clock stream on a rooftop, but it quickly shifts to decisions under pressure: picking RTMP because it worked, chunking uploads to survive flaky networks, and caching manifests to sidestep thundering herds.
A few lessons cut through all the technical layers:
Start small, iterate fast: The first version of Live aimed to be shippable. That decision accelerated learning and forced architectural clarity early.
Design for scale from day one: Systems built without scale in mind often need to be rebuilt. Live was architected to handle billions, even before the first billion arrived.
Bake reliability into architecture: Redundancy, caching, and failover had to be part of the core system. Bolting them on later wouldn’t have worked.
Plan for flexibility in features: From celebrity streams to 360° video, the infrastructure had to adapt quickly. Static systems would’ve blocked product innovation.
Expect the unexpected: Viral content, celebrity spikes, and global outages aren’t edge cases but inevitable. Systems that can’t handle unpredictability don’t last long.
References:
Get your product in front of more than 1,000,000 tech professionals.
Our newsletter puts your products and services directly in front of an audience that matters - hundreds of thousands of engineering leaders and senior engineers - who have influence over significant tech decisions and big purchases.
Space Fills Up Fast - Reserve Today
Ad spots typically sell out about 4 weeks in advance. To ensure your ad reaches this influential audience, reserve your space now by emailing [email protected].
2025-05-19 23:30:40
Most AI hype today is about developer productivity and augmentation. This overshadows a more important opportunity: AI as the foundation for products and features that weren’t previously possible.
But, it’s a big jump from building Web 2.0 applications to AI-native products, and most engineers aren’t prepared. That’s why Maven is hosting 50+ short, free live lessons with tactical guidance and demos from AI engineers actively working within the new paradigm.
This is your opportunity to upskill fast and free with Maven’s expert instructors. We suggest you start with these six:
Evaluating Agentic AI Applications - Aish Reganti (Tech Lead, AWS) & Kiriti Badam (Applied AI, OpenAI)
Optimize Structured Data Retrievals with Evals - Hamel Husain (AI Engineer, ex-Github, Airbnb)
CTO Playbook for Agentic RAG - Doug Turnbull (Principal ML Engineer, ex-Reddit, Shopify)
Build Your First Agentic AI App with MCP - Rafael Pierre (Principal AI Engineer, ex-Hugging Face, Databricks)
Embedding Performance through Generative Evals - Jason Liu (ML Engineer, ex-Stitch Fix, Meta)
Design Vertical AI Agents - Hamza Farooq (Stanford instructor, ex-Google researcher)
To go deeper with these experts, use code BYTEBYTEGO to get $100 off their featured AI courses - ends June 30th.
Disclaimer: The details in this post have been derived from the articles/videos shared online by the Pinterest engineering team. All credit for the technical details goes to the Pinterest Engineering Team. The links to the original articles and videos are present in the references section at the end of the post. We’ve attempted to analyze the details and provide our input about them. If you find any inaccuracies or omissions, please leave a comment, and we will do our best to fix them.
Pinterest launched in March 2010 with a typical early-stage setup: a few founders, one engineer, and limited infrastructure. The team worked out of a small apartment. Resources were constrained, and priorities were clear—ship features fast and figure out scalability later.
The app didn’t start as a scale problem but as a side project. A couple of founders, a basic web stack, and an engineer stitching together Python scripts in a shared apartment. No one was thinking about distributed databases when the product might not survive the week.
The early tech decisions reflected this mindset. The stack included:
Python for the application layer
NGINX as a front-end proxy
MySQL with a read replica
MongoDB for counters
A basic task queue for sending emails and social updates
But the scale increased rapidly. One moment it’s a few thousand users poking around images of food and wedding dresses. Then traffic doubles, and suddenly every service is struggling to maintain performance, logs are unreadable, and engineers are scrambling to add new infrastructure in production.
This isn’t a rare story. The path from minimum viable product to full-blown platform often involves growing pains that architectural diagrams never show. Systems that worked fine for 10,000 users collapse at 100,000.
In this article, we’ll look at how Pinterest scaled its architecture and the challenges the team faced along the way.
Pinterest’s early architecture reflected its stage: minimal headcount, fast iteration cycles, and a stack assembled more for momentum than long-term sustainability.
When the platform began to gain traction, the team moved quickly to AWS. The choice wasn’t the result of an extensive evaluation. AWS offered enough flexibility, credits were available, and the team could avoid the friction of setting up physical infrastructure.
The initial architecture looked like this:
The technical foundation included:
NGINX served as the front-end HTTP server. It handled incoming requests and routed them to application servers. NGINX was chosen for its simplicity and performance, and it required little configuration to get working reliably.
Python-based web engines, four in total, processed application logic. Python offered a high development speed and decent ecosystem support. For a small team, being productive in the language mattered more than raw runtime performance.
MySQL, with a read replica, served as the primary data store. The read replica allowed some level of horizontal scaling by distributing read operations, which helped reduce load on the primary database. At this point, the schema and data model were still evolving rapidly.
MongoDB was added to handle counters. These were likely used for tracking metrics like likes, follows, or pins. MongoDB’s document model and ease of setup made it a quick solution. It wasn’t deeply integrated or tuned.
A simple task queue system was used to decouple time-consuming operations like sending emails and posting to third-party platforms (for example, Facebook, Twitter). The queue was critical for avoiding performance bottlenecks during user interactions.
This stack wasn’t optimized for scale or durability. It was assembled to keep the product running while the team figured out what the product needed to become.
As Pinterest’s popularity grew, traffic doubled every six weeks. This kind of growth put a great strain on the infrastructure.
Pinterest hit this scale with a team of just three engineers. In response, the team added technologies reactively. Each new bottleneck triggered the introduction of a new system:
MySQL remained the core data store, but began to strain under concurrent reads and writes.
MongoDB handled counters.
Cassandra was used to handle distributed data needs.
Membase was introduced, less because it fit and more because it was promoted as a quick fix.
Redis entered the stack for caching and fast key-value access.
The result was architectural entropy. Multiple databases, each with different operational behaviors and failure modes, created complexity faster than the team could manage.
Each new database seemed like a solution at first until its own set of limitations emerged. This pattern repeated: an initial phase of relief, followed by operational pain, followed by yet another tool. By the time the team realized the cost, they were maintaining a fragile web of technologies they barely had time to understand.
This isn’t rare. Growth exposes every shortcut. What works for a smaller-scale project can’t always handle production traffic. Adding tools might buy time, but without operational clarity and internal expertise, it also buys new failure modes.
By late 2011, the team recognized a hard truth: complexity wasn’t worth it. They didn’t need more tools. They needed fewer, more reliable ones.
After enduring repeated failures and operational overload, Pinterest stripped the stack down to its essentials.
The architecture stabilized around three core components: MySQL, Redis, and Memcached (MIMC). Everything else (MongoDB, Cassandra, Membase) was removed or isolated.
Let’s look at each in more detail.
MySQL returned to the center of the system.
It stored all core user data: boards, pins, comments, and domains. It also became the system of record for legal and compliance data, where durability and auditability were non-negotiable. The team leaned on MySQL’s maturity: decades of tooling, robust failover strategies, and a large pool of operational expertise.
However, MySQL had one critical limitation: it didn’t scale horizontally out of the box. Pinterest addressed this by sharding and, more importantly, designing systems to tolerate that limitation. Scaling became a question of capacity planning and box provisioning, not adopting new technologies.
The diagram below shows how sharding works in general:
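As a rough illustration of the general idea, the sketch below maps a fixed pool of logical shards, keyed by user ID, onto physical MySQL hosts. The shard count and host names are made up for illustration, not Pinterest’s actual layout.

```python
# A general sketch of application-level sharding: a fixed number of logical
# shards mapped onto a smaller set of MySQL hosts. Values are illustrative.
NUM_SHARDS = 4096  # fixed up front so existing IDs never move

SHARD_TO_HOST = {
    # logical shard ranges -> physical MySQL servers (grown by splitting ranges)
    range(0, 1024): "mysql-db-001",
    range(1024, 2048): "mysql-db-002",
    range(2048, 3072): "mysql-db-003",
    range(3072, 4096): "mysql-db-004",
}


def shard_for_user(user_id: int) -> int:
    """All of a user's boards and pins live on the same logical shard."""
    return user_id % NUM_SHARDS


def host_for_shard(shard_id: int) -> str:
    for shard_range, host in SHARD_TO_HOST.items():
        if shard_id in shard_range:
            return host
    raise ValueError(f"no host mapped for shard {shard_id}")


# Example: route a query for user 987654321
shard = shard_for_user(987654321)
print(shard, host_for_shard(shard))
```

Fixing the logical shard count up front is what turns growth into a provisioning exercise: splitting a range across new hosts moves data, but never changes which shard a user belongs to.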
Redis handled problems that MySQL couldn’t solve cleanly:
Feeds: Pushing updates in real-time requires low latency and fast access patterns.
Follower graph: The complexity of user-board relationships demanded a more dynamic, memory-resident structure.
Public feeds: Redis provided a true list structure with O(1) inserts and fast range reads, ideal for rendering content timelines.
Redis was easier to operate than many of its NoSQL competitors. It was fast, simple to understand, and predictable, at least when kept within RAM limits.
Redis offers several persistence modes, each with clear implications:
No persistence: Everything lives in RAM and disappears on reboot. It’s fast and simple, but risky for anything critical.
Snapshotting (RDB): Periodically saves a binary dump of the dataset. If a node fails, it can recover from the last snapshot. This mode balances performance with recoverability.
Append-only file (AOF): Logs every write operation. More durable, but with higher I/O overhead.
Pinterest leaned heavily on Redis snapshotting. It wasn’t bulletproof, but for systems like the follower graph or content feeds, the trade-off worked: if a node died, data from the last few hours could be rebuilt from upstream sources. This avoided the latency penalties of full durability without sacrificing recoverability.
The diagram below shows snapshotting with Redis.
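In configuration terms, the snapshotting trade-off looks roughly like the sketch below, expressed here through redis-py. The thresholds are illustrative; the same settings normally live in redis.conf as `save <seconds> <changes>` directives.

```python
# Sketch of configuring RDB snapshotting via redis-py (values illustrative).
import redis

r = redis.Redis(host="localhost", port=6379)

# Snapshot if at least 10 keys changed in 300s, or 10,000 keys changed in 60s.
r.config_set("save", "300 10 60 10000")

# Disable the append-only file: the Pinterest-style trade-off, accepting the
# loss of the last few hours on a crash in exchange for lower I/O overhead.
r.config_set("appendonly", "no")

r.bgsave()           # fork a background child to write the RDB dump
print(r.lastsave())  # timestamp of the most recent successful snapshot
```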
MySQL remained Pinterest’s source of truth, but for real-time applications, it fell short:
Write latency increased with volume, especially under high concurrency.
Tree-based structures (for example, B-trees) made inserts and updates slower and harder to optimize for queue-like workloads.
Query flexibility came at the cost of performance predictability.
Redis offered a better fit for these cases:
Feeds: Users expect content updates instantly. Redis handled high-throughput, low-latency inserts with predictable performance.
Follower graph: Pinterest’s model allowed users to follow boards, users, or combinations. Redis stored this complex relationship graph as in-memory structures with near-zero access time.
Caching: Redis served as a fast-access layer for frequently requested data like profile overviews, trending pins, or related content.
MIMC served as a pure cache layer. It didn’t try to be more than that, and that worked in its favor.
It offloaded repetitive queries, reduced latency, and helped absorb traffic spikes. Its role was simple but essential: act as a buffer between user traffic and persistent storage.
As Pinterest matured, scaling wasn’t just about systems. It was also about the separation of concerns.
The team began pulling tightly coupled components into services, isolating core functionality into defined boundaries with clear APIs.
Certain parts of the architecture naturally became services because they carried high operational risk or required specialized logic:
Search Service: Handled query parsing, indexing, and result ranking. Internally, it became a complex engine, dealing with user signals, topic clustering, and content retrieval. From the outside, it exposed a simple interface: send a query, get relevant results. This abstraction insulated the rest of the system from the complexity behind it.
Feed Service: Managed the distribution of content updates. When a user pinned something, the feed service handled propagation to followers. It also enforced delivery guarantees and ordering semantics, which were tricky to get right at scale.
MySQL Service: Became one of the first hardened services. It sat between applications and the underlying database shards. This layer implemented safety checks, access controls, and sharding logic. By locking down direct access, it avoided mistakes that previously caused corrupted writes, like saving data to the wrong shard.
Background jobs were offloaded to a system called PinLater. The model was simple: tasks were written to a MySQL-backed queue with a name, payload, and priority. Worker pools pulled from this queue and executed jobs.
This design had key advantages:
Tasks were durable.
Failure recovery was straightforward.
Prioritization was tunable without major system redesign.
PinLater replaced ad hoc queues and inconsistent task execution patterns. It introduced reliability and consistency into Pinterest’s background job landscape.
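Here is a minimal sketch of that model, with sqlite3 standing in for the MySQL backing store. The schema, job states, and handler wiring are illustrative rather than Pinterest’s actual PinLater implementation.

```python
# Minimal PinLater-style queue sketch: durable jobs in a SQL table, workers
# pull by priority. sqlite3 stands in for MySQL; names are illustrative.
import json
import sqlite3

db = sqlite3.connect("pinlater_sketch.db", isolation_level=None)
db.execute("""
    CREATE TABLE IF NOT EXISTS jobs (
        id       INTEGER PRIMARY KEY AUTOINCREMENT,
        name     TEXT NOT NULL,          -- e.g. 'send_email'
        payload  TEXT NOT NULL,          -- JSON blob
        priority INTEGER NOT NULL,       -- lower number = more urgent
        state    TEXT NOT NULL DEFAULT 'pending'
    )
""")


def enqueue(name: str, payload: dict, priority: int = 5) -> None:
    db.execute(
        "INSERT INTO jobs (name, payload, priority) VALUES (?, ?, ?)",
        (name, json.dumps(payload), priority),
    )


def work_once(handlers: dict) -> bool:
    """Claim the most urgent pending job and run it; return False if queue is empty."""
    row = db.execute(
        "SELECT id, name, payload FROM jobs WHERE state = 'pending' "
        "ORDER BY priority, id LIMIT 1"
    ).fetchone()
    if row is None:
        return False
    job_id, name, payload = row
    db.execute("UPDATE jobs SET state = 'running' WHERE id = ?", (job_id,))
    try:
        handlers[name](json.loads(payload))
        db.execute("UPDATE jobs SET state = 'done' WHERE id = ?", (job_id,))
    except Exception:
        # Durable failure handling: mark pending again so another worker retries.
        db.execute("UPDATE jobs SET state = 'pending' WHERE id = ?", (job_id,))
    return True


enqueue("send_email", {"to": "user@example.com"}, priority=1)
work_once({"send_email": lambda p: print("emailing", p["to"])})
```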
To avoid hardcoded service dependencies and brittle connection logic, the team used Zookeeper as a service registry. When an application needed to talk to a service, it queried Zookeeper to find available instances (see the sketch after this list).
This offered a few critical benefits:
Resilience: Services could go down and come back up without breaking clients.
Elasticity: New instances could join the cluster automatically.
Connection management: Load balancing became less about middleware and more about real-time awareness.
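A minimal sketch of that pattern, using the kazoo Zookeeper client, might look like the following. The paths, hosts, and service names are assumptions for illustration.

```python
# Sketch of Zookeeper-based service discovery using the kazoo client.
# Paths, hosts, and service names are illustrative.
import json
import random
from kazoo.client import KazooClient

zk = KazooClient(hosts="zk1.example.com:2181")
zk.start()

SERVICE_PATH = "/services/feed-service"


def register(host: str, port: int) -> None:
    """Each instance registers an ephemeral node; it vanishes if the instance dies,
    so clients never see dead endpoints."""
    zk.ensure_path(SERVICE_PATH)
    zk.create(
        f"{SERVICE_PATH}/instance-",
        json.dumps({"host": host, "port": port}).encode(),
        ephemeral=True,
        sequence=True,
    )


def discover() -> dict:
    """Pick a live instance at random (crude client-side load balancing)."""
    children = zk.get_children(SERVICE_PATH)
    if not children:
        raise RuntimeError("no live instances registered")
    data, _stat = zk.get(f"{SERVICE_PATH}/{random.choice(children)}")
    return json.loads(data)


register("10.0.1.17", 9090)
print(discover())
```

Because registrations are ephemeral nodes, an instance that crashes simply disappears from the registry, which is what gives clients the resilience described above.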
As Pinterest scaled, visibility became non-negotiable. The team needed to know what was happening across the system in real-time. Logging and metrics weren’t optional but part of the core infrastructure.
The logging backbone started with Kafka, a high-throughput, distributed message broker. Every action on the site (pins, likes, follows, errors) pushed data into Kafka. Think of it as a firehose: everything flows through, nothing is lost unless explicitly discarded.
Kafka solved a few key problems:
It decoupled producers from consumers. Application servers didn’t need to know who would process the data or how.
It buffered spikes in traffic without dropping events.
It created a single source of truth for event data.
Once the data hit Kafka, it flowed into Secor, an internal tool that parsed and transformed logs. Secor broke log entries into structured formats, tagged them with metadata, and wrote them into AWS S3.
This architecture had a critical property: durability. S3 served as a long-term archive. Once the data landed there, it was safe. Even if downstream systems failed, logs could be replayed or reprocessed later.
The team used this pipeline not just for debugging, but for analytics, feature tracking, and fraud detection. The system was designed to be extensible. Any new use case could hook into Kafka or read from S3 without affecting the rest of the stack.
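The producer side of such a pipeline can be sketched roughly as follows, using the kafka-python client. The topic name, broker address, and event fields are illustrative assumptions.

```python
# Sketch of the producer side of the logging pipeline: every user action is
# published to Kafka as a structured event. Names are illustrative.
import json
import time
from kafka import KafkaProducer

producer = KafkaProducer(
    bootstrap_servers="kafka.example.com:9092",
    value_serializer=lambda event: json.dumps(event).encode("utf-8"),
)


def log_event(event_type: str, user_id: int, **fields) -> None:
    """Fire-and-forget: the app doesn't know (or care) who consumes the event."""
    producer.send("site-events", {
        "type": event_type,              # pin, like, follow, error, ...
        "user_id": user_id,
        "ts": time.time(),
        **fields,
    })


log_event("pin", user_id=42, board_id=7, pin_id=12345)
producer.flush()   # ensure buffered events are actually handed to the brokers
```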
Kafka wasn’t only about log storage. It enabled near-real-time monitoring. Stream processors consumed Kafka topics and powered dashboards, alerts, and anomaly detection tools. The moment something strange happened, such as a spike in login failures or a drop in feed loads, it showed up immediately.
This feedback loop was essential. Pinterest didn’t just want to understand what happened after a failure. They wanted to catch it as it began.
Pinterest’s path from early chaos to operational stability left behind a clear set of hard-earned lessons, most of which apply to any system scaling beyond its initial design.
First, log everything from day one. Early versions of Pinterest logged to MySQL, which quickly became a bottleneck. Moving to a pipeline of Kafka to Secor to S3 changed the game. Logs became durable, queryable, and reusable. Recovery, debugging, analytics: everything improved.
Second, know how to process data at scale. Basic MapReduce skills went a long way. Once logs landed in S3, teams used MapReduce jobs to analyze trends, identify regressions, and support product decisions. SQL-like abstractions made the work accessible even for teams without deep data engineering expertise.
Third, instrument everything that matters. Pinterest adopted StatsD to track performance metrics without adding friction. Counters, timers, and gauges flowed through UDP packets, avoiding coupling between the application and the metrics backend (a brief sketch follows these lessons). Lightweight, asynchronous instrumentation helped spot anomalies early, before users noticed.
Fourth, don’t start with complexity. Overcomplicating architecture early on, especially by adopting too many tools too fast, guarantees long-term operational pain.
Finally, pick mature, well-supported technologies. MySQL and Memcached weren’t flashy, but they worked. They were stable, documented, and surrounded by deep communities. When something broke, answers were easy to find.
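To ground the instrumentation lesson above, here is a minimal sketch using the commonly used statsd Python client. The host, prefix, metric names, and stubbed application functions are all illustrative.

```python
# Sketch of lightweight StatsD instrumentation. Metrics go out as
# fire-and-forget UDP packets, so the app never blocks on the metrics backend.
from statsd import StatsClient

statsd = StatsClient(host="statsd.example.com", port=8125, prefix="web")


def save_pin(user_id, image_url):   # placeholder for the real write path
    ...


def pending_job_count():            # placeholder queue-depth lookup
    return 0


def create_pin(user_id: int, image_url: str) -> None:
    statsd.incr("pins.created")                   # counter
    with statsd.timer("pins.create_latency"):     # timer around the critical path
        save_pin(user_id, image_url)
    statsd.gauge("queue.pending_jobs", pending_job_count())  # point-in-time gauge


create_pin(42, "https://example.com/cat.jpg")
```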
Pinterest didn’t scale because it adopted cutting-edge technology. It scaled because it survived complexity and invested in durability, simplicity, and visibility. For engineering leaders and architects, the takeaways are pragmatic:
Design systems to fail visibly, not silently.
Favor tools with proven track records over tools with bold promises.
Assume growth will outpace expectations, and build margins accordingly.
References:
Get your product in front of more than 1,000,000 tech professionals.
Our newsletter puts your products and services directly in front of an audience that matters - hundreds of thousands of engineering leaders and senior engineers - who have influence over significant tech decisions and big purchases.
Space Fills Up Fast - Reserve Today
Ad spots typically sell out about 4 weeks in advance. To ensure your ad reaches this influential audience, reserve your space now by emailing [email protected].
2025-05-17 11:30:24
Modern bots are smarter than ever. They execute JavaScript, store cookies, rotate IPs, and even solve CAPTCHAs with AI. As attacks grow more sophisticated, traditional detection methods can’t reliably keep them out.
Enter WorkOS Radar, your all-in-one bot defense solution. With a single API, you can instantly secure your signup flow against brute force attacks, leaked credentials, disposable emails, and fake signups. Radar uses advanced device fingerprinting to stop bots in their tracks and keep real users flowing through.
Fast to implement and built to scale.
This week’s system design refresher:
APIs Explained in 6 Minutes! (YouTube video)
12 MCP Servers You Can Use in 2025
How to Deploy Services
The System Design Topic Map
How Transformers Architecture Works?
We’re Hiring at ByteByteGo
SPONSOR US
MCP (Model Context Protocol) is an open standard that simplifies how AI models, particularly LLMs, interact with external data sources, tools, and services. An MCP server acts as a bridge between these AI models and external tools. Here are the top MCP servers:
File System MCP Server
Allows the LLM to directly access the local file system to read, write, and create directories.
GitHub MCP Server
Connects Claude to GitHub repos and allows file updates, code searching.
Slack MCP Server
MCP Server for Slack API, enabling Claude to interact with Slack workspaces.
Google Maps MCP Server
MCP Server for Google Maps API.
Docker MCP Server
Integrate with Docker to manage containers, images, volumes, and networks.
Brave MCP Server
Web and local search using Brave’s Search API.
PostgreSQL MCP Server
An MCP server that enables LLM to inspect database schemas and execute read-only queries.
Google Drive MCP Server
An MCP server that integrates with Google Drive to allow reading and searching over files.
Redis MCP Server
MCP Server that provides access to Redis databases.
Notion MCP Server
This project implements an MCP server for the Notion API.
Stripe MCP Server
MCP Server to interact with the Stripe API.
Perplexity MCP Server
An MCP Server that connects to Perplexity’s Sonar API for real-time search.
Over to you: Which other MCP Server will you add to the list?
The zero-to-one guide for teams adopting AI coding assistants. This guide shares proven prompting techniques, the use cases that save the most time for developers, and leadership strategies for encouraging adoption. It’s designed to be something engineering leaders can distribute internally to help teams get started with integrating AI into their daily work.
Download this guide to get:
The 10 most time-saving use cases for AI coding tools
Effective AI prompting techniques from experienced AI users
Leadership strategies for encouraging AI use
Deploying or upgrading services is risky. In this post, we explore risk mitigation strategies.
The diagram below illustrates the common ones.
Multi-Service Deployment
In this model, we deploy new changes to multiple services simultaneously. This approach is easy to implement, but since all the services are upgraded at the same time, it is hard to manage and test dependencies. It’s also hard to roll back safely.
Blue-Green Deployment
With blue-green deployment, we have two identical environments: one is staging (blue) and the other is production (green). The staging environment is one version ahead of production. Once testing is done in the staging environment, user traffic is switched to it, and staging becomes the new production. This strategy makes rollback simple, but maintaining two identical production-quality environments can be expensive.
Canary Deployment
A canary deployment upgrades services gradually, each time for a subset of users. It is cheaper than blue-green deployment and makes rollback easy. However, since there is no staging environment, we have to test in production. This process is more complicated because we need to monitor the canary while gradually migrating more and more users away from the old version.
A/B Test
In an A/B test, different versions of services run in production simultaneously. Each version runs an “experiment” for a subset of users. A/B testing is a cheap way to test new features in production, but we need to control the rollout carefully so that features aren’t pushed to users by accident.
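To make the canary and A/B mechanics concrete, here is a small sketch of deterministic, hash-based traffic splitting; bucketing on a hash of the user ID keeps each user on a consistent version. The bucket count and percentages are arbitrary.

```python
# Sketch of deterministic traffic splitting for canary or A/B rollouts.
import hashlib


def bucket(user_id: str, buckets: int = 100) -> int:
    digest = hashlib.sha256(user_id.encode()).hexdigest()
    return int(digest, 16) % buckets


def route(user_id: str, canary_percent: int = 5) -> str:
    """Send a small, stable slice of users to the canary; everyone else stays
    on the current version. Raising canary_percent gradually widens the rollout."""
    return "canary" if bucket(user_id) < canary_percent else "stable"


print(route("user-42"))        # the same user always gets the same answer
print(route("user-42", 50))    # widening the rollout moves more buckets over
```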
Over to you - Which deployment strategy have you used? Did you witness any deployment-related outages in production and why did they happen?
Effective system design is a game of trade-offs and requires a broad knowledge base to make the best decisions. This topic map categorizes the essential system design topics based on categories.
Application Layer: It consists of the core concepts such as availability, scalability, reliability, and other NFRs. Also covers design and architectural topics such as OOP, DDD, Microservices, Clean Architecture, Modular Monoliths, and so on.
Network & Communication: It covers communication protocols, service integration, messaging, real-time communication, and event-driven architecture.
Data Layer: It covers the basics of database systems (schema design, indexing, SQL vs NoSQL, transactions, etc), the various types of databases, and the nuances of distributed databases (replication, sharding, leader election, etc.)
Scalability & Reliability: This covers scalability strategies (horizontal, stateless, caching, partitioning, etc) and reliability strategies like load balancing, rate limiting, and so on.
Security & Observability: It covers authentication and authorization techniques (OAuth 2, JWT, PASETO, Sessions, Cookies, RBAC, etc.) and security threats. The observability area deals with topics like monitoring, tracing, and logging.
Infrastructure & Deployments: Deals with CI/CD pipelines, containerization and orchestration, serverless architecture, IaC, and disaster recovery techniques.
Over to you: What else will you add to the list?
The Transformer architecture has become the foundation of some of the most popular LLMs, including GPT, Gemini, Claude, DeepSeek, and Llama.
Here’s how it works:
A typical transformer-based model has two main parts: encoder and decoder. The encoder reads and understands the input. The decoder uses this understanding to generate the correct output.
In the first step (Input Embedding), each word is converted into a vector of numbers representing its meaning.
Next, a pattern called Positional Encoding tells the model where each word is in the sentence. This is because the word order matters in a sentence. For example “the cat ate the fish” is different from “the fish ate the cat”.
Next is the Multi-Head Attention, which is the brain of the encoder. It allows the model to look at all words at once and determine which words are related. In the Add & Normalize phase, the model adds what it learned from attention back into the sentence.
The Feed Forward process adds extra depth to the understanding. The overall process is repeated multiple times so that the model can deeply understand the sentence.
After the encoder finishes, the decoder kicks into action. The output embedding converts each word in the expected output into numbers. To understand where each word should go, we add Positional Encoding.
The Masked Multi-Head Attention hides the future words so the model predicts only one word at a time.
The Multi-Head Attention phase aligns the right parts of the input with the right parts of the output. The decoder looks at both the input sentence and the words it has generated so far.
The Feed Forward applies more processing to make the final word choice better. The process is repeated several times to refine the results.
Once the decoder has predicted numbers for each word, it passes them through a Linear Layer to prepare for output. This layer maps the decoder’s output to a large set of possible words.
After the Linear Layer generates scores for each word, the Softmax layer converts those scores into probabilities. The word with the highest probability is chosen as the next word.
Finally, a human-readable sentence is generated.
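To make the attention step less abstract, here is a minimal NumPy sketch of scaled dot-product attention, the core operation inside each Multi-Head Attention block, with the optional causal mask the decoder uses to hide future words. Dimensions and values are arbitrary.

```python
# Minimal NumPy sketch of scaled dot-product attention.
import numpy as np


def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)   # numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def attention(Q, K, V, mask=None):
    """Q, K, V: (seq_len, d_k) matrices derived from the token embeddings."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)            # how strongly each word attends to the others
    if mask is not None:
        scores = np.where(mask, scores, -1e9)  # masked (future) positions are hidden
    weights = softmax(scores, axis=-1)         # each row sums to 1
    return weights @ V                         # weighted mix of value vectors


seq_len, d_k = 5, 8
rng = np.random.default_rng(0)
Q, K, V = (rng.normal(size=(seq_len, d_k)) for _ in range(3))
causal_mask = np.tril(np.ones((seq_len, seq_len), dtype=bool))  # decoder-style masking
print(attention(Q, K, V, causal_mask).shape)  # (5, 8)
```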
Over to you: What else will you add to understand the Transformer Architecture?
We're hiring 3 positions at ByteByteGo: Technical Product Manager, Technical Educator – System Design, and Sales/Partnership.
𝐏𝐫𝐨𝐝𝐮𝐜𝐭 𝐌𝐚𝐧𝐚𝐠𝐞𝐫 – 𝐓𝐞𝐜𝐡𝐧𝐢𝐜𝐚𝐥 (Remote, part-time)
We’re hiring a technical product manager to work with me on building an interview preparation platform. Think mock interviews, live coaching, AI assisted learning and hands-on tools that help engineers land their next role.
You’ll be responsible for defining the product strategy, prioritizing features, and working closely with me to bring ideas to life.
You must have conducted 100+ technical interviews (e.g., system design, algorithms, behavioral) and have a deep understanding of what makes a great candidate experience. Bonus if you’ve worked at a top tech company or have experience coaching candidates.
We’re looking for someone who can:
• Build from 0 to 1 with minimal guidance
• Translate user pain points into well-scoped solutions
• Iterate quickly based on feedback and data
𝐓𝐞𝐜𝐡𝐧𝐢𝐜𝐚𝐥 𝐄𝐝𝐮𝐜𝐚𝐭𝐨𝐫 – 𝐒𝐲𝐬𝐭𝐞𝐦 𝐃𝐞𝐬𝐢𝐠𝐧 (Remote, part-time)
We’re hiring a system design technical educator to help deepen our educational library. This role is perfect for someone who loves explaining complex engineering topics clearly, whether through long-form articles, diagrams, or short-form posts.
You’ll collaborate with the team to write newsletters, coauthor chapters of books and guides, and create engaging visual content around system design, architecture patterns, scalability, and more. If you’ve written for blogs, docs, newsletters, or taught online, we’d love to see your work.
𝐒𝐚𝐥𝐞𝐬/𝐏𝐚𝐫𝐭𝐧𝐞𝐫𝐬𝐡𝐢𝐩 (US/Canada based remote role, part-time/full-time)
We’re looking for a sales and partnerships specialist who will help grow our newsletter sponsorship business. This role will focus on securing new advertisers, nurturing existing relationships, and optimizing revenue opportunities across our newsletter and other media formats.
How to Apply: send your resume and a short note on why you’re excited about this role to [email protected]
Get your product in front of more than 1,000,000 tech professionals.
Our newsletter puts your products and services directly in front of an audience that matters - hundreds of thousands of engineering leaders and senior engineers - who have influence over significant tech decisions and big purchases.
Space Fills Up Fast - Reserve Today
Ad spots typically sell out about 4 weeks in advance. To ensure your ad reaches this influential audience, reserve your space now by emailing [email protected].
2025-05-15 23:30:30
Imagine a ride-sharing app that shows a driver’s location with a few seconds of delay. Now, imagine if the entire app refused to show anything until every backend service agreed on the perfect current location. No movement, no updates, just a spinning wheel.
That’s what would happen if strong consistency were always preferred in a distributed system.
Modern applications (social feeds, marketplaces, logistics platforms) don’t run on a single database or monolithic backend anymore. They run on event-driven, distributed systems. Services publish and react to events. Data flows asynchronously, and components update independently. This decoupling unlocks flexibility, scalability, and resilience. However, it also means consistency is no longer immediate or guaranteed.
This is where eventual consistency becomes important.
Some examples are as follows:
A payment system might mark a transaction as pending until multiple downstream services confirm it.
A feed service might render posts while a background job deduplicates or reorders them later.
A warehouse system might temporarily oversell a product, then issue a correction as inventory updates sync across regions.
These aren’t bugs but trade-offs.
Eventual consistency lets each component do its job independently, then reconcile later. It prioritizes availability and responsiveness over immediate agreement.
This article explores what it means to build with eventual consistency in an event-driven world. It breaks down how to deal with out-of-order events and how to design systems that can handle delays.
2025-05-13 23:30:31
Like it or not, your API has a new user: AI agents. Make accessing your API services easy for them with an MCP (Model Context Protocol) server. Speakeasy uses your OpenAPI spec to generate an MCP server with tools for all your API operations to make building agentic workflows easy.
Once you've generated your server, use the Speakeasy platform to develop evals, prompts and custom toolsets to take your AI developer platform to the next level.
Disclaimer: The details in this post have been derived from the articles/videos shared online by the Slack engineering team. All credit for the technical details goes to the Slack Engineering Team. The links to the original articles and videos are present in the references section at the end of the post. We’ve attempted to analyze the details and provide our input about them. If you find any inaccuracies or omissions, please leave a comment, and we will do our best to fix them.
Most people think of Slack as a messaging app. That’s technically accurate, but from a systems perspective, it’s more like a real-time, multiplayer collaboration platform with millions of concurrent users, thousands of messages per second, and an architecture that evolved under some unusual constraints.
At peak weekday hours, Slack maintains over five million simultaneous WebSocket sessions. That’s not just a metric, but a serious architectural challenge. Each session represents a live, long-running connection, often pushing out typing indicators, presence updates, and messages in milliseconds. Delivering this kind of interactivity on a global scale is hard. Doing it reliably with high performance is even harder.
One interesting piece of trivia: the team that built Slack was originally building a video game named Glitch, a browser-based MMORPG. While Glitch had a small but passionate audience, it struggled to become financially sustainable. During the development of Glitch, the team created an internal communication tool. When Glitch shut down, the team recognized the tool’s potential and began developing it into a bigger product for business use. The backend for this internal tool became the skeleton of what would become Slack.
This inheritance shaped Slack’s architecture in two key ways:
Separation of concerns: Just as game servers manage real-time events separately from game logic, Slack split its architecture early. One service (the “channel server”) handled real-time message propagation. Another (the “web app”) managed business logic, storage, and user auth.
Push-first mentality: Unlike traditional request-response apps, Glitch pushed updates as the state changed. Slack adopted this model wholesale. WebSockets weren’t an optimization—they were the foundation.
This article explores how Slack’s architecture evolved to meet the demands of a system that makes real-time collaboration possible across organizations of 100,000+ people.
Slack made real-time collaboration seamless for teams. CodeRabbit brings that same spirit to code reviews. It analyzes every PR using context-aware AI that understands your codebase, suggesting changes, catching bugs, and even asking questions when something looks off. Perfect for fast-moving teams who want quality code reviews without slowing down. It integrates with GitHub and GitLab, and it’s like having a senior engineer review every commit with you. Free for open source.
Slack’s early architecture was a traditional monolithic backend fused with a purpose-built, real-time message delivery system.
The monolith, written in Hacklang, handled the application logic. Hacklang (Facebook’s typed dialect of PHP) offered a pragmatic path: move fast with a familiar scripting language, then gradually tighten things with types. For a product iterating quickly, that balance paid off. Slack’s backend handled everything from file permissions to session management to API endpoints.
But the monolith didn’t touch messages in motion. That job belonged to a real-time message bus: the channel server, written in Java. The channel server pushed updates over long-lived WebSocket connections, broadcast messages to active clients, and arbitrated message order. When two users hit “send” at the same moment, it was the channel server that decided which message came first.
Here’s how the division looked in terms of functionalities:
Web App (Hacklang)
Auth, permissions, and storage
API endpoints and job queuing
Session bootstrapping and metadata lookup
Channel Server (Java)
WebSocket handling
Real-time message fan-out
Typing indicators, presence blips, and ordering guarantees
This split worked well when Slack served small teams and development moved fast. But over time, the costs surfaced:
The monolith grew brittle as testing got harder and deployment risk went up.
The channel server held state, which complicated recovery and scaling.
Dependencies between the two made failures messy. If the web app went down, the channel server couldn’t persist messages, but might still tell users they’d sent them.
Messaging apps live or die by trust. When someone sends a message and sees it appear on screen, they expect it to stay there and to show up for everyone else. If that expectation breaks, the product loses credibility fast. In other words, persistence becomes a foundational feature.
Slack’s design bakes this in from the start. Unlike Internet Relay Chat (IRC), where messages vanish the moment they scroll off-screen, Slack assumes every message matters, even the mundane ones. It doesn’t just aim to display messages in real-time. It aims to record them, index them, and replay them on demand. This shift from ephemeral to durable changes everything.
IRC treats each message like a radio transmission, whereas Slack treats messages like emails. If the user missed something, they can always scroll up, search, and re-read later. This shift demands a system that guarantees:
Messages don’t disappear
Message order stays consistent
What one user sees, every user sees
Slack delivers that through what looks, at first glance, like a simple contract:
When a message shows up in a channel, everyone should see it.
When a message appears in the UI, it should be in stable storage.
When clients scroll back, they should all see the same history, in the same order.
This is a textbook case of atomic broadcast.
Atomic broadcast is a classic problem in distributed systems. It's a formal model where multiple nodes (or users) receive the same messages in the same order, and every message comes from someone. It guarantees three core properties:
Validity: If a user sends a message, it eventually gets delivered.
Integrity: No message appears unless it was sent.
Total Order: All users see messages in the same sequence.
Slack implements a real-world approximation of atomic broadcast because it is essential to the product. Imagine a team seeing different sequences of edits, or comments that reference messages that “don’t exist” on someone else’s screen.
But here’s the twist: in distributed systems, atomic broadcast is as hard as consensus. And consensus, under real-world failure modes, is provably impossible to guarantee. So Slack, like many production systems, takes the pragmatic path. It relaxes constraints, defers work, and recovers from inconsistency instead of trying to prevent it entirely.
This tension between theoretical impossibility and practical necessity drives many of Slack’s architectural decisions.
In real-time apps, low latency is a necessity. When a user hits “send,” the message should appear instantly. Anything slower breaks the illusion of conversation. But making that feel snappy while also guaranteeing that the message is stored, ordered, and replayable? That’s where things get messy.
Slack’s original message send flow prioritized responsiveness. The architecture put the channel server (the real-time message bus) at the front of the flow. A message went from the client straight to the channel server, which then:
Broadcast it to all connected clients
Sent an acknowledgment back to the sender
Later handed it off to the web app for indexing, persistence, and other deferred work
This gave users lightning-fast feedback. However, it also introduced a dangerous window: the server might crash after confirming the message but before persisting it. To the sender, the message looked “sent.” To everyone else, especially after a recovery, it might be gone.
This flow worked, but it carried risk:
Stateful servers meant complex failover logic and careful coordination.
Deferred persistence meant the UI could technically lie about message delivery.
Retries and recovery had to reconcile what was shown vs. what was saved.
Slack patched around this with persistent buffers and retry loops. But the complexity was stacking up. The system was fast, but fragile.
As Slack matured, and as outages and scale pushed the limits, the team reversed the flow.
In the new send model, the web app comes first:
The client sends the message via HTTP POST to the web app
The web app logs the message to the job queue (persistence, indexing, and parsing all happen here)
Only then does it invoke the channel server to broadcast the message in real-time
This change improves several things:
Crash safety: If anything goes down mid-flow, either the message persists or the client gets a clear failure.
Stateless channel servers: Without needing local buffers or retries, they become easier to scale and maintain.
Latency preserved: Users still see messages immediately, because the real-time broadcast happens fast, even while persistence continues in the background.
And one subtle benefit: the new flow doesn’t require a WebSocket connection to send a message. That’s a big deal for mobile clients responding to notifications, where setting up a full session just to reply was costly.
The old system showed messages fast, but sometimes dropped them. The new one does more work up front, but makes a stronger promise in terms of persistence.
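A rough sketch of the persist-first send path appears below. The queue and channel-server objects are stand-in stubs, not Slack’s actual APIs.

```python
# Rough sketch of the "persist first, broadcast second" send path. Interfaces
# and names are placeholders, not Slack's real services.
import uuid


class JobQueue:           # stand-in for the durable job queue
    def enqueue(self, job_name, payload):
        print("queued", job_name)


class ChannelServer:      # stand-in for the real-time fan-out tier
    def broadcast(self, channel_id, message):
        print("broadcast to", channel_id)


job_queue, channel_server = JobQueue(), ChannelServer()


def handle_send_message(channel_id: str, sender_id: str, text: str) -> dict:
    message = {
        "id": str(uuid.uuid4()),
        "channel": channel_id,
        "sender": sender_id,
        "text": text,
    }

    # 1. Durable work first: enqueue for persistence, indexing, link parsing.
    #    If this fails, the client gets an explicit error and can retry.
    job_queue.enqueue("persist_message", message)

    # 2. Only after the message is safely queued does the channel server fan it
    #    out in real time, so the UI never shows a message that could vanish.
    channel_server.broadcast(channel_id, message)

    # 3. The HTTP response doubles as the acknowledgment; no WebSocket needed to send.
    return {"ok": True, "message_id": message["id"]}


print(handle_send_message("C123", "U456", "hello team"))
```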
For small teams, starting a Slack session looks simple. The client requests some data, connects to a WebSocket, and starts chatting. However, at enterprise scale, that “simple” startup becomes a serious architectural choke point.
Originally, Slack used a method called RTM Start (Real-Time Messaging Start). When a client initiated a session, the web app assembled a giant JSON payload: user profiles, channel lists, membership maps, unread message counts, and a WebSocket URL. This was meant to be a keyframe: a complete snapshot of the team’s state, so the client could start cold and stay in sync via real-time deltas.
It worked until teams got big.
For small teams (under 100 users), the payload was lightweight.
For large organizations (10,000+ users), it ballooned to tens of megabytes.
Clients took tens of seconds just to parse the response and build local caches.
If a network partition hit, thousands of clients would reconnect at once, slamming the backend with redundant work.
And it got worse:
Payload size grew quadratically with team size. Every user could join every channel, and the web app calculated all of it.
All this work happened in a single data center, creating global latency for users in Europe, Asia, or South America.
This wasn’t just slow. It was a vector for cascading failure. One bad deploy or dropped connection could take out Slack’s control plane under its own load.
To fix this, Slack introduced Flannel, a purpose-built microservice that acts as a stateful, geo-distributed cache for session bootstrapping.
Instead of rebuilding a fresh session snapshot for every client on demand, Flannel does a few things differently:
Maintains a pre-warmed in-memory cache of team metadata
Listens to WebSocket events to keep that cache up to date, just like a client would
Serves session boot data locally, from one of many regional replicas
Sits astride the WebSocket connection, terminating it and handling session validation
Here’s what changes in the flow:
A client connects to Flannel and presents its auth token.
Flannel verifies the token (delegating to the web app if needed).
If the cache is warm, it sends a hello response immediately. No need to hit the origin.
This flips the cost model from compute-heavy startup to cache-heavy reuse. It’s tempting to think that Flannel just adds complexity, but Slack found that at scale, complexity that’s predictable and bounded is better than simplicity that breaks under pressure.
Every system seems to work on the whiteboard. The real test comes when it’s live, overloaded, and something fails. At Slack’s scale, maintaining reliable real-time messaging isn’t just about handling more messages per second. It’s also about absorbing failure without breaking user expectations.
One of the most visible symptoms at scale is message duplication. Sometimes a user sees their message posted twice. It’s not random. It’s a side effect of client retries.
Here’s how it happens:
A mobile client sends a message.
Network flakiness delays the acknowledgment.
The client times out and retries.
Both sends make it through, or the same message gets delivered twice, and now the user wonders what just happened.
To survive this, Slack leans on idempotency. Each message includes a client-generated ID or salt. When the server sees the same message ID again, it knows it’s not a new send. This doesn’t eliminate all duplication, especially across devices, but it contains the damage.
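A minimal illustration of that idempotency idea: the server keys on the client-generated ID, so a retry of the same send returns the original acknowledgment instead of creating a second message. The names and in-memory storage below are simplified assumptions, not Slack's implementation:

# Dedup keyed by (channel, client_msg_id); a real system would bound or expire this map.
seen_sends = {}
message_log = []

def send_message(channel_id, client_msg_id, text):
    key = (channel_id, client_msg_id)
    if key in seen_sends:
        # Retry of a send we already processed: return the original ack, don't append again.
        return seen_sends[key]
    message = {"channel": channel_id, "text": text, "seq": len(message_log) + 1}
    message_log.append(message)
    ack = {"ok": True, "seq": message["seq"]}
    seen_sends[key] = ack
    return ack

first = send_message("C123", "abc-1", "hello")
retry = send_message("C123", "abc-1", "hello")   # network timeout made the client resend
print(first == retry, len(message_log))          # -> True 1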
On the backend, retries and failures get more serious. A message might:
Reach the channel server but fail to persist
Persist to the job queue but never push
Push to some clients and not others
The system has to detect and recover from all of these without losing messages, breaking ordering guarantees, or flooding the user with confusing errors.
This is where queueing architecture matters. Slack uses Kafka for durable message queuing and Redis for in-flight, fast-access job data. Kafka acts as the system’s ledger and Redis provides short-term memory.
This separation balances:
Durability vs. speed: Kafka holds the truth; Redis handles the work-in-progress.
Retry logic: Jobs pulled from Kafka can be retried intelligently if processing fails.
Concurrency control: The system avoids processing the same message twice or waiting forever on a stuck job.
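A rough sketch of this split, assuming the kafka-python and redis-py client libraries are available and a broker and Redis instance are running locally; the topic name, key names, and persist_message() are made up for illustration:

import json
import redis
from kafka import KafkaConsumer

r = redis.Redis(decode_responses=True)
consumer = KafkaConsumer(
    "message-jobs",                      # hypothetical topic name
    bootstrap_servers="localhost:9092",
    group_id="message-workers",
    enable_auto_commit=False,            # commit offsets only after the job is done
    value_deserializer=lambda b: json.loads(b.decode("utf-8")),
)

def persist_message(job):
    ...  # write to storage, update the search index, parse links, etc.

for record in consumer:
    job = record.value
    inflight_key = f"inflight:{job['client_msg_id']}"
    # Redis as short-term memory: skip the job if another worker already claimed it.
    if not r.set(inflight_key, "1", nx=True, ex=60):
        continue
    try:
        persist_message(job)             # Kafka remains the durable ledger if this fails
        consumer.commit()                # acknowledge only after successful processing
    finally:
        r.delete(inflight_key)

If persist_message() raises, the offset is never committed, so the job is replayed from Kafka; the Redis key simply expires or is deleted, letting another worker pick it up.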
Slack’s architecture isn’t simple, and that’s by design. The system embraces complexity in the places where precision matters most: real-time messaging, session consistency, and user trust. These are the end-to-end paths where failure is visible, consequences are immediate, and user perception can shift in a heartbeat.
The architecture reflects a principle that shows up in high-performing systems again and again: push complexity to the edge, keep the core fast and clear. Channel servers, Flannel caches, and job queues each exist to protect a smooth user experience from the messiness of distributed systems, partial failures, and global scale.
At the same time, the parts of the system that don’t need complexity, like storage coordination or REST API responses, stay lean and conventional.
Ultimately, no architecture stands still. Every scaling milestone, every user complaint, every edge case pushes the system to adapt. Slack’s evolution from monolith-plus-bus to globally distributed microservices wasn’t planned in a vacuum. It came from running into real limits, then designing around them.
The lesson isn’t to copy Slack’s architecture. It’s to respect the trade-offs it reveals:
Optimize for latency, but tolerate slowness in the right places.
Build around failure, not away from it.
Embrace complexity where correctness pays for itself, and fight to simplify the rest.
Get your product in front of more than 1,000,000 tech professionals.
Our newsletter puts your products and services directly in front of an audience that matters - hundreds of thousands of engineering leaders and senior engineers - who have influence over significant tech decisions and big purchases.
Space Fills Up Fast - Reserve Today
Ad spots typically sell out about 4 weeks in advance. To ensure your ad reaches this influential audience, reserve your space now by emailing [email protected].
2025-05-10 23:30:27
Manual testing on personal devices is too slow and too limited. It forces teams to cut releases a week early just to test before submitting them to app stores. And without broad device coverage, issues slip through.
QA Wolf gets mobile apps to 80% automated test coverage in weeks. They create and maintain your test suite in Appium (no vendor lock-in) — and provide unlimited, 100% parallel test runs with zero flakes.
✅ QA cycles reduced from hours to less than 15 minutes
✅ Tests run on real iOS devices and Android emulators
✅ Flake-free runs, no false positives
✅ Human-verified bug reports
No more manual E2E testing. No more slow QA cycles. No more bugs reaching production.
Rated 4.8/5 ⭐ on G2
This week’s system design refresher:
9 Clean Code Principles To Keep In Mind
The 4 Types of SQL Joins
How to Learn Cloud Computing?
Visualizing a SQL query
Explaining JSON Web Token (JWT) with simple terms
SPONSOR US
Meaningful Names: Name variables and functions to reveal their purpose, not just their value.
One Function, One Responsibility: Functions should do one thing.
Avoid Magic Numbers: Replace hard-coded values with named constants to give them meaning (see the short sketch after this list).
Use Descriptive Booleans: Boolean names should state a condition, not just its value.
Keep Code DRY: Duplicate code means duplicate bugs. Try to reuse logic where it makes sense.
Avoid Deep Nesting: Flatten your code flow to improve clarity and reduce cognitive load.
Comment Why, Not What: Explain the intention behind your code, not the obvious mechanics.
Limit Function Arguments: Too many parameters make a function hard to call and test. Group related data into objects.
Code Should Be Self-Explanatory: Well-written code needs fewer comments because it reads like a story.
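A small before-and-after sketch in Python (illustrative only) that applies named constants, descriptive booleans, and early returns:

# Before: magic numbers, a vague name, deep nesting.
def check(u):
    if u is not None:
        if u["age"] >= 18:
            if u["status"] == 2:
                return True
    return False

# After: named constants reveal intent, the booleans read like conditions,
# and an early return flattens the flow.
MINIMUM_AGE = 18
STATUS_ACTIVE = 2

def is_eligible_user(user):
    if user is None:
        return False
    is_adult = user["age"] >= MINIMUM_AGE
    is_active = user["status"] == STATUS_ACTIVE
    return is_adult and is_active

print(is_eligible_user({"age": 30, "status": STATUS_ACTIVE}))  # -> True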
Over to you: Which other clean code principle will you add to the list?
The zero-to-one guide for teams adopting AI coding assistants. This guide shares proven prompting techniques, the use cases that save the most time for developers, and leadership strategies for encouraging adoption. It’s designed to be something engineering leaders can distribute internally to help teams get started with integrating AI into their daily work.
Download this guide to get:
The 10 most time-saving use cases for AI coding tools
Effective AI prompting techniques from experienced AI users
Leadership strategies for encouraging AI use
SQL joins combine rows from two or more tables based on a related column. Here are the different types of joins you can use (a small runnable example follows the list):
Inner Join
Returns only the rows that have matching values in both tables, i.e., the data common to both.
Left Join
Returns all rows from the left table and matching rows from the right table. If a row in the left table doesn’t have a match in the right table, the right table’s columns will contain NULL values in that row.
Right Join
Returns all rows from the right table and matching rows from the left table. If no matching record exists in the left table for a record in the right table, the columns from the left table in the result will contain NULL values.
Full Outer Join
Returns all rows from both tables, filling in NULL for missing matches.
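To make the first two joins concrete, here is a small self-contained Python example using the standard-library sqlite3 module (RIGHT and FULL OUTER JOIN are only supported in newer SQLite releases, so they are omitted here; the table and column names are made up):

import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, user_id INTEGER, amount REAL);
    INSERT INTO users  VALUES (1, 'Alice'), (2, 'Bob'), (3, 'Carol');
    INSERT INTO orders VALUES (10, 1, 25.0), (11, 1, 40.0), (12, 2, 15.0);
""")

# INNER JOIN: only users that have at least one order.
print(conn.execute("""
    SELECT u.name, o.amount FROM users u
    INNER JOIN orders o ON o.user_id = u.id
""").fetchall())   # Carol is absent

# LEFT JOIN: every user, with NULL (None) where no order matches.
print(conn.execute("""
    SELECT u.name, o.amount FROM users u
    LEFT JOIN orders o ON o.user_id = u.id
""").fetchall())   # Carol appears with None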
Over to you: Which SQL Join have you used the most?
Cloud computing is a vast field with an ever-growing footprint. It can often get tricky for a new developer to understand where to start. Here’s a learning map:
Cloud Computing Basics
This includes topics such as “what is cloud computing,” its benefits, cloud models (public, private, hybrid, and multi), and a comparison of cloud vs. on-premise.
Cloud Service Models
Learn about cloud service models such as IaaS, PaaS, and SaaS.
Cloud Providers
Explore the various popular cloud platforms such as AWS, Azure, GCP, Oracle Cloud, IBM Cloud, etc. Also, learn how to choose a cloud provider.
Key Cloud Services
Learn the key cloud services related to Compute (EC2, Azure VM, Docker, Kubernetes, Lambda, etc.), Storage (EBS, Azure Disk, S3, Azure Blob, EFS, etc.), and Networking (VPC, ELB, Azure LB, CloudFront, and Azure CDN).
Security & Compliance
Learn about the critical security and compliance points related to identity, access management, encryption, data security, DDoS protection, and WAF.
Cloud DevOps & Automation
Learn Cloud DevOps and automation in specific areas such as CI/CD, IaC, and Monitoring.
Over to you: What else will you add to the list for learning cloud computing?
SQL statements are executed by the database system in several steps, including:
Parsing the SQL statement and checking its validity
Transforming the SQL into an internal representation, such as relational algebra
Optimizing the internal representation and creating an execution plan that utilizes index information
Executing the plan and returning the results
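One way to peek at the resulting execution plan is SQLite's EXPLAIN QUERY PLAN, shown here from Python; other databases expose similar EXPLAIN commands with different output formats, and the schema below is just an example:

import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT);
    CREATE INDEX idx_users_email ON users(email);
""")

# The planner reports whether it will scan the table or use the index.
for row in conn.execute(
    "EXPLAIN QUERY PLAN SELECT id FROM users WHERE email = ?", ("a@example.com",)
):
    print(row)   # e.g. (..., 'SEARCH users USING COVERING INDEX idx_users_email (email=?)')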
Imagine you have a special box called a JWT. Inside this box, there are three parts: a header, a payload, and a signature.
The header is like the label on the outside of the box. It tells us what type of box it is and how it's secured. It's usually written in a format called JSON, which is just a way to organize information using curly braces ({ }) and colons (:).
The payload is like the actual message or information you want to send. It could be your name, age, or any other data you want to share. It's also written in JSON format, so it's easy to understand and work with.
Now, the signature is what makes the JWT secure. It's like a special seal that only the sender knows how to create. The signature is created using a secret code, kind of like a password. This signature ensures that nobody can tamper with the contents of the JWT without the sender knowing about it.
When you want to send the JWT to a server, you put the header, payload, and signature inside the box. Then you send it over to the server. The server can easily read the header and payload to understand who you are and what you want to do.
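A minimal sketch of how the three parts fit together for an HMAC-signed (HS256-style) token, using only Python's standard library; the secret and claims below are made up for illustration:

import base64, hashlib, hmac, json

SECRET = b"demo-secret-key"   # in real systems, keep this out of source code

def b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()

def make_jwt(payload: dict) -> str:
    header = {"alg": "HS256", "typ": "JWT"}
    signing_input = f"{b64url(json.dumps(header).encode())}.{b64url(json.dumps(payload).encode())}"
    signature = hmac.new(SECRET, signing_input.encode(), hashlib.sha256).digest()
    return f"{signing_input}.{b64url(signature)}"

def verify_jwt(token: str) -> bool:
    signing_input, _, signature = token.rpartition(".")
    expected = hmac.new(SECRET, signing_input.encode(), hashlib.sha256).digest()
    return hmac.compare_digest(b64url(expected), signature)

token = make_jwt({"sub": "user-123", "name": "Alice"})
print(token.count("."), verify_jwt(token))   # -> 2 True
print(verify_jwt(token + "A"))               # tampered token -> False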
Over to you: When should we use JWT for authentication? What are some other authentication methods?