
EP171: The Generative AI Tech Stack

2025-07-12 23:30:32

The Ultimate Weapon for Software Architects (Sponsored)

What if you could instantly identify the only 3 functionalities to test for non-regression after modifying a complex Java class?

What if you could visualize the ripple effect, from database to front-end, of changing a column data type?

Master your application with the architects’ ultimate weapon

CAST Imaging automatically maps any application’s inner workings:

  • Visualize all dependencies & explore database access

  • Trace end-to-end data & call flows, assess change impact

  • Identify structural flaws typically missed by code quality tools

Stop wasting countless hours reverse-engineering your code manually.
Move faster with CAST Imaging, the automated software mapping tech.

MAP YOUR APPLICATION - FREE TRIAL

CAST Imaging supports any mix of Java/JEE, .NET, Python, COBOL, SQL, and 100+ other languages, frameworks, and database engines.


This week’s system design refresher:

  • The Generative AI Tech Stack

  • 24 Good Resources to Learn Software Architecture in 2025

  • ByteByteGo Technical Interview Prep Kit

  • Database Index Types Every Developer Should Know

  • The Agentic AI Learning Roadmap

  • 12 MCP Servers You Can Use in 2025

  • SPONSOR US


The Generative AI Tech Stack

GenAI refers to systems capable of creating new content, such as text, images, code, or music, by learning patterns from existing data. Here are the key building blocks of the GenAI tech stack:

  1. Cloud Hosting & Inference: Providers like AWS, GCP, Azure, and Nvidia offer the infrastructure to run and scale AI workloads.

  2. Foundational Models: Core LLMs (such as GPT, Claude, Mistral, Llama, Gemini, DeepSeek), trained on massive datasets, form the base for all GenAI applications.

  3. Frameworks: Tools like LangChain, PyTorch, and Hugging Face help build, deploy, and integrate models into apps.

  4. Databases and Orchestration: Vector DBs (such as Pinecone, Weaviate) and orchestration tools (such as LangChain, LlamaIndex) manage memory, retrieval, and logic flow (a minimal retrieval sketch follows this list).

  5. Fine-Tuning: Platforms like Weights & Biases, OctoML, and Hugging Face enable training models for specific tasks or domains.

  6. Embeddings and Labeling: Services like Cohere, Scale AI, Nomic, and JinaAI help generate and label vector representations to power search and RAG systems.

  7. Synthetic Data: Tools like Gretel, Tonic AI, and Mostly AI create artificial datasets to enhance training.

  8. Model Supervision: Monitor model performance, bias, and behavior. Tools such as Fiddler, Helicone, and WhyLabs help.

  9. Model Safety: Helps ensure ethical, secure, and safe deployment of GenAI systems. Solutions like LLM Guard, Arthur AI, and Garak help with this.
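
To make the retrieval piece of items 4 and 6 concrete, here is a minimal, library-agnostic sketch of the lookup a vector DB performs for RAG: documents and a query are embedded as vectors, and the closest documents are ranked by cosine similarity. NumPy stands in for a real embedding model and vector store here; the embed function is a hashing placeholder, not a real API.

```python
import numpy as np

def embed(text: str, dim: int = 8) -> np.ndarray:
    """Placeholder embedding: a real system would call an embedding model
    (e.g. from Cohere or Hugging Face) instead of hashing words."""
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    v = rng.normal(size=dim)
    return v / np.linalg.norm(v)

# "Index" a few documents, as a vector DB would.
docs = ["How to tune Postgres indexes", "Intro to LangChain agents", "Scaling Kafka consumers"]
doc_vectors = np.stack([embed(d) for d in docs])

# Retrieve: rank documents by cosine similarity to the query vector.
query_vec = embed("vector databases for RAG")
scores = doc_vectors @ query_vec            # cosine similarity (vectors are unit-normalized)
best = np.argsort(scores)[::-1][:2]
for i in best:
    print(f"{scores[i]:.3f}  {docs[i]}")
```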

Over to you: What else will you add to this list?


How Top Fintech Engineering Teams Actually Ship (Sponsored)

fintech_devcon is the technical conference where real builders share how they ship complex systems at scale.

Hear directly from engineers at Amazon, Block, Plaid, Chime, DocuSign, Gusto, and more.

Topics include:

💸Ledgers + payment infra
🔐 Onboarding, auth, and security
🧠 LLMs + fraud detection
🧰 Dev tooling + API design
💥 No sales pitches; just architecture diagrams, code, and battle scars

This is where fintech’s toughest engineering challenges get unpacked.

It’s all happening August 4–6 in Denver.

🎟️ Use BBG25 to save $195.

Get your ticket


24 Good Resources to Learn Software Architecture in 2025

The resources can be divided into different types such as:

  1. Software Design Books
    Some books that can help are DDIA, System Design Interview Volumes 1 & 2, Clean Architecture, Domain-Driven Design, and Software Architecture: The Hard Parts.

  2. Tech Blogs and Newsletters
    Read technical blogs by companies like Netflix, Uber, Meta, and Airbnb. Also, the ByteByteGo newsletter provides insights into software design every week.

  3. YouTube Channels and Architectural Resources
    YouTube channels like MIT Distributed Systems, Goto Conferences, and ByteByteGo can help with software architecture and system design. Azure Architecture Center and AWS Architecture Blog are other important resources.

  4. WhitePapers
    For deeper insights, read whitepapers like Facebook Memcache Scaling, Cassandra, Amazon DynamoDB, Kafka, and Google File System.

  5. Software Career Books
    A software architect also needs to develop holistic skills. Books about the career side of software, such as The Pragmatic Programmer, The Software Architect Elevator, The Software Engineer’s Guidebook, and A Philosophy of Software Design, can help.

Over to you: Which other resources will you add to the list?


ByteByteGo Technical Interview Prep Kit

Launching the All-in-one interview prep. We’re making all the books available on the ByteByteGo website.

What's included:

  • System Design Interview

  • Coding Interview Patterns

  • Object-Oriented Design Interview

  • How to Write a Good Resume

  • Behavioral Interview (coming soon)

  • Machine Learning System Design Interview

  • Generative AI System Design Interview

  • Mobile System Design Interview

  • And more to come

Launch sale: 50% off


Database Index Types Every Developer Should Know

A database index is a derived structure that maps column values to the physical locations of rows in a table. Let’s look at some key index types:

  1. Primary Index
    This index is automatically created when a primary key is defined on a table. Such an index can be either dense or sparse; a sparse index is usually preferred because it stores far fewer entries.

    A dense index contains one entry for every row in the table. On the other hand, a sparse index contains entries for only some rows in the table.

  2. Clustered Index
    A clustered index determines the physical order of rows in a table. Only one clustered index can exist on a table because data can only be stored in one order at a time. It is great for range queries, ordered scans, and I/O efficiency.

  3. Secondary Index
    A secondary (non-clustered) index is a separate structure that holds a copy of one or more columns along with pointers to the actual rows in the table. It doesn’t affect how data is physically stored, and it can use the primary index to locate the full records (see the SQLite sketch after this list).
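
To see a secondary index in action, here is a small sketch using Python’s built-in sqlite3 module: it builds a table, adds a secondary index on email, and asks the query planner whether the index is used for an exact-match lookup. SQLite is just a convenient stand-in; the same idea applies to any relational database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, name TEXT)")
conn.executemany(
    "INSERT INTO users (email, name) VALUES (?, ?)",
    [(f"user{i}@example.com", f"User {i}") for i in range(1000)],
)

# Secondary (non-clustered) index: a separate structure that maps
# email values back to rows, without changing how rows are stored.
conn.execute("CREATE INDEX idx_users_email ON users(email)")

# Ask the query planner how it will execute an exact-match lookup.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT name FROM users WHERE email = ?",
    ("user42@example.com",),
).fetchall()
print(plan)  # should mention 'SEARCH ... USING INDEX idx_users_email'
```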

Over to you: Which other index type will you add to the list?


The Agentic AI Learning Roadmap

  1. An AI Agent is a system capable of acting autonomously, reacting to its environment, and using tools (APIs, the Internet, code, etc.), often under human guidance (a minimal loop sketch follows this list).

  2. To build AI agents, one must know tools like Python, Jupyter, PyTorch, and GitHub Copilot. These enable coding, experimentation, and integration with AI libraries and APIs.

  3. GenAI Foundational Models
    Familiarity with large models like GPT, Gemini, LLaMa, DeepSeek, and Claude is essential. These models provide the base intelligence that agents can use for reasoning, generation, and understanding.

  4. AI Agent Development Stack
    Tools like LangChain, AutoGen, and CrewAI, along with frameworks like Semantic Kernel and Hugging Face, power agent workflows. These components manage tasks, memory, and external tool integrations in agent pipelines.

  5. API Design
    Understanding API design approaches like REST, GraphQL, gRPC, and SOAP is crucial to building interoperable agents. Key concepts include HTTP methods, status codes, versioning, cookies, headers, and caching.

  6. Type of AI Agents
    Learn about the several types of AI agents, such as simple reflex, model-based reflex, goal-based, utility-based, and learning agents. Each varies in complexity.

  7. AI Agent System Architecture
    AI agents can operate as single agents, in multi-agent systems, or in human-machine collaboration. Architecture depends on the use case.
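
To ground point 1, below is a minimal, framework-free sketch of an agent loop in Python: a (mocked) model decides whether to call a tool or answer, and the loop runs tools until a final answer is produced. The fake_llm and calculator functions are illustrative stand-ins, not real APIs; frameworks like LangChain or AutoGen wrap this same loop with memory, planning, and richer tool schemas.

```python
def calculator(expression: str) -> str:
    """A trivial 'tool' the agent can call."""
    return str(eval(expression, {"__builtins__": {}}))  # demo only; never eval untrusted input

TOOLS = {"calculator": calculator}

def fake_llm(messages: list[dict]) -> dict:
    """Stand-in for a foundational model. A real agent would send `messages`
    to GPT/Claude/Gemini and parse a tool call or final answer from the reply."""
    if not any(m["role"] == "tool" for m in messages):
        return {"tool": "calculator", "args": {"expression": "21 * 2"}}
    return {"answer": f"The result is {messages[-1]['content']}."}

def run_agent(user_goal: str) -> str:
    messages = [{"role": "user", "content": user_goal}]
    for _ in range(5):                       # cap the loop to avoid runaway agents
        decision = fake_llm(messages)
        if "answer" in decision:             # model chose to respond directly
            return decision["answer"]
        tool = TOOLS[decision["tool"]]       # model chose to act: run the tool
        result = tool(**decision["args"])
        messages.append({"role": "tool", "content": result})
    return "Gave up after too many steps."

print(run_agent("What is 21 times 2?"))
```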

Over to you: What else will you add to the AI Agent Learning Roadmap?


12 MCP Servers You Can Use in 2025

MCP (Model Context Protocol) is an open standard that simplifies how AI models, particularly LLMs, interact with external data sources, tools, and services. An MCP server acts as a bridge between these AI models and external tools. Here are the top MCP servers (a minimal server sketch follows the list):

  1. File System MCP Server
    Allows the LLM to directly access the local file system to read, write, and create directories.

  2. GitHub MCP Server
    Connects Claude to GitHub repos and allows file updates and code searching.

  3. Slack MCP Server
    MCP Server for Slack API, enabling Claude to interact with Slack workspaces.

  4. Google Maps MCP Server
    MCP Server for Google Maps API.

  5. Docker MCP Server
    Integrate with Docker to manage containers, images, volumes, and networks.

  6. Brave MCP Server
    Web and local search using Brave’s Search API.

  7. PostgreSQL MCP Server
    An MCP server that enables an LLM to inspect database schemas and execute read-only queries.

  8. Google Drive MCP Server
    An MCP server that integrates with Google Drive to allow reading and searching over files.

  9. Redis MCP Server
    MCP Server that provides access to Redis databases.

  10. Notion MCP Server
    This project implements an MCP server for the Notion API.

  11. Stripe MCP Server
    MCP Server to interact with the Stripe API.

  12. Perplexity MCP Server
    An MCP Server that connects to Perplexity’s Sonar API for real-time search.
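
For a rough sense of what building an MCP server involves, here is a minimal sketch assuming the FastMCP helper from the official MCP Python SDK (installed with pip install mcp); the exact import path and decorator names should be verified against the SDK documentation.

```python
# A minimal MCP server sketch. Assumes the FastMCP helper from the official
# Python SDK; verify the import path and decorators against the SDK docs.
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("notes")          # server name shown to the client (e.g. Claude)

NOTES: dict[str, str] = {}

@mcp.tool()
def add_note(title: str, body: str) -> str:
    """Store a note so the model can recall it later."""
    NOTES[title] = body
    return f"Saved note '{title}'."

@mcp.tool()
def read_note(title: str) -> str:
    """Return a previously stored note."""
    return NOTES.get(title, "No such note.")

if __name__ == "__main__":
    mcp.run()                   # speaks MCP over stdio by default
```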

Over to you: Which other MCP Server will you add to the list?


SPONSOR US

Get your product in front of more than 1,000,000 tech professionals.

Our newsletter puts your products and services directly in front of an audience that matters - hundreds of thousands of engineering leaders and senior engineers - who have influence over significant tech decisions and big purchases.

Space Fills Up Fast - Reserve Today

Ad spots typically sell out about 4 weeks in advance. To ensure your ad reaches this influential audience, reserve your space now by emailing [email protected].

Database Index Internals: Understanding the Data Structures

2025-07-10 23:30:34

Creating an index is easy. Nearly every developer has created or used an index at some point, whether directly or indirectly. But knowing what to index is only one part of the equation. The more difficult question is understanding how the index works underneath.

Indexing isn’t a surface-level optimization. It’s a problem of data structures. The way an index organizes, stores, and retrieves data directly shapes the performance of read and write operations. Different data structures behave differently. 

  • Some may excel at range scans. 

  • Some are optimized for exact-match lookups. 

  • Others are purpose-built for full-text search or geospatial queries. 

These decisions affect everything from query planning to I/O patterns to the amount of memory consumed under load.

When a query slows down or a system starts struggling with disk I/O, the index structure often sits at the heart of the issue. A poorly chosen index format can lead to inefficient access paths, unnecessary bloat, or slow inserts. Conversely, a well-aligned structure can turn a brute-force scan into a surgical lookup.

In this article, we will cover the core internal data structures that power database indexes. Each section will walk through how the structure works, what problems it solves, where it performs best, and what limitations it carries. 

The Role of Index Structures in Query Execution

Read more

How Discord Stores Trillions of Messages with High Performance

2025-07-08 23:30:51

MCP Authorization in 5 Easy OAuth Specs (Sponsored)

Securely authorizing access to an MCP server used to be an open question. Now there's a clear answer: OAuth. It provides a path with five key specs covering delegation, token exchange, and scoped access.

WorkOS packages the full stack into one API, so you can add MCP authorization without building your own OAuth infrastructure.

Implement MCP Auth with WorkOS


Disclaimer: The details in this post have been derived from the articles shared online by the Discord Engineering Team. All credit for the technical details goes to the Discord Engineering Team.  The links to the original articles and sources are present in the references section at the end of the post. We’ve attempted to analyze the details and provide our input about them. If you find any inaccuracies or omissions, please leave a comment, and we will do our best to fix them.

Many chat platforms never reach the scale where they have to deal with trillions of messages. However, Discord does. And when that happens, a somewhat manageable data problem can quickly turn into a major engineering challenge that involves millions of users sending messages across millions of channels. 

At this scale, even the smallest architectural choices can have a big impact. Things like hot partitions can turn into support nightmares. Garbage collection pauses aren’t just annoying; they can cause system-wide latency spikes. The wrong database design can waste developer time and operational bandwidth.

Discord’s early database solution (moving from MongoDB to Apache Cassandra®) promised horizontal scalability and fault tolerance. It delivered both, but at a significant operational cost. Over time, keeping Apache Cassandra® stable required constant firefighting, careful compaction strategies, and JVM tuning. Eventually, the database meant to scale with Discord had become a bottleneck.

In this article, we will walk through how Discord rebuilt its message storage layer from the ground up. We will learn the issues Discord faced with Apache Cassandra® and their shift to ScyllaDB. Also, we will look at the introduction of Rust-based data services to shield the database from overload and improve concurrency handling.


Go from Engineering to AI Product Leadership (Sponsored)

As an engineer or tech lead, you know how to build complex systems. But how do you translate that technical expertise into shipping world-class AI products? The skills that define great AI product leaders—from ideation and data strategy to managing LLM-powered roadmaps—are a different discipline.

This certification is designed for technical professionals. Learn directly from Miqdad Jaffer, Product Leader at OpenAI, in the #1 rated AI certificate on Maven. You won't just learn theory; you will get hands-on experience developing a capstone project and mastering the frameworks used to build and scale products in the real world.

Exclusive for ByteByteGo Readers: Use code BBG500 to save $500 before the next cohort sells out.

View Course & Save $500


Initial Architecture

Discord's early message storage relied on Apache Cassandra®. The schema grouped messages by channel_id and a bucket, which represented a static time window. 

This schema allowed for efficient lookups of recent messages in a channel, and Snowflake IDs provided natural chronological ordering. A replication factor of 3 ensured each partition existed on three separate nodes for fault tolerance.

Within each partition, messages were sorted in descending order by message_id, a Snowflake-based 64-bit integer that encoded creation time.

The diagram below shows the overall partitioning strategy based on the channel ID and bucket.
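
To make the partition key concrete, here is a small illustrative sketch of how a (channel_id, bucket) key can be derived from a Snowflake message ID. The Discord epoch is public, but the 10-day bucket window below is an assumption for illustration, not Discord’s actual value.

```python
DISCORD_EPOCH_MS = 1_420_070_400_000           # Snowflake epoch: Jan 1, 2015
BUCKET_WINDOW_MS = 10 * 24 * 60 * 60 * 1000    # assumed static window; illustrative only

def snowflake_timestamp_ms(message_id: int) -> int:
    # The top 42 bits of a Snowflake encode milliseconds since the epoch.
    return (message_id >> 22) + DISCORD_EPOCH_MS

def partition_key(channel_id: int, message_id: int) -> tuple[int, int]:
    # All messages in the same channel and time window land in one partition,
    # sorted within it by message_id (i.e., by creation time).
    bucket = snowflake_timestamp_ms(message_id) // BUCKET_WINDOW_MS
    return (channel_id, bucket)
```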

At a small scale, this design worked well. However, scale often introduces problems that don't show up in normal situations.

Apache Cassandra® favors write-heavy workloads, which aligns well with chat systems. However,  high-traffic channels with massive user bases can generate orders of magnitude more messages than quiet ones. 

A few things started to go wrong at this point:

  • Hot partitions emerged when a popular channel received a surge of reads or writes. Since Apache Cassandra® was configured to partition data by channel_id and bucket, a single node could get overloaded trying to serve queries for that partition. These hot spots increased latency not just locally but across the cluster.

  • Reads had to check in-memory memtables and potentially scan across multiple on-disk SSTables. As data grew and compaction lagged, reads became slower and more expensive.

The diagram below shows the concept of hot partitions:

Performance wasn't the only issue. Operational overhead also ballooned.

  • Compactions, which merge and rewrite SSTables, fell behind, increasing disk usage and degrading read performance. The team often had to take nodes out of rotation and babysit them back to health.

  • JVM garbage collection became a recurring source of instability. GC pauses could stretch long enough to cause timeouts or trigger failovers. Tuning heap sizes and GC parameters became a full-time task for on-call engineers.

  • As the message load grew, so did the cluster. What began with 12 nodes eventually reached 177 nodes, each one a moving part that required care and coordination.

At this point, Apache Cassandra® was being scaled manually, by throwing more hardware and more engineer hours at the problem. The system was running, but it was clearly under strain.

Switching to ScyllaDB

ScyllaDB entered the picture as a natural alternative. It preserved compatibility with the query language and data model of Apache Cassandra®, which meant the surrounding application logic could remain largely unchanged. However, under the hood, the execution model was very different.

Some key characteristics were as follows:

  • ScyllaDB is written in C++, which eliminates the need for a garbage collector. That immediately removed one of the most painful sources of latency in the Apache Cassandra® setup.

  • It uses a shard-per-core architecture, assigning each CPU core its subset of data and handling requests independently. This design improved isolation between workloads and avoided many of the coordination bottlenecks seen in multi-threaded JVM-based systems.

  • Repair operations and consistency handling were performed more efficiently, thanks to lower overhead in ScyllaDB’s internals.

  • One key blocker during evaluation was reverse query performance. Discord’s message history lookups sometimes require scanning in ascending order, the opposite of the default descending sort. Initially, ScyllaDB struggled with this use case, but its engineering team prioritized improvements, making the operation fast enough for production needs.

Overall, ScyllaDB offered the same interface with a far more predictable runtime. 

Rust-Based Data Services Layer

To reduce direct load on the database and prevent repeated query amplification, Discord also introduced a dedicated data services layer. These services act as intermediaries between the main API monolith and the ScyllaDB clusters. They are responsible solely for data access and coordination, and no business logic is embedded here.

The goal behind them was simple: isolate high-throughput operations, control concurrency, and protect the database from accidental overload.

Rust was chosen for the data services for both technical and operational reasons: it brings together low-level performance and modern safety guarantees.

Some key advantages of choosing Rust are as follows:

  • Native performance comparable to C and C++, which is critical in latency-sensitive paths.

  • Safe concurrency through Rust’s ownership and type system. This avoids the class of bugs common in multithreaded C++ or Java systems.

  • Asynchronous I/O powered by the Tokio ecosystem, which allows efficient handling of thousands of simultaneous requests without blocking threads.

  • Native drivers for both Apache Cassandra® and ScyllaDB, enabling direct, efficient access to the underlying data.

Each data service exposes gRPC endpoints that map one-to-one with database queries. This keeps the architecture clean and transparent. The services do not embed any business logic. They are designed purely for data access and efficiency.

Request Coalescing

One of the most important features in this layer is request coalescing.

When multiple users request the same piece of data, such as a popular message in a high-traffic channel, the system avoids hammering the database with duplicate queries.

  • The first incoming request triggers a worker task that performs the database query.

  • Any subsequent requests for the same data check for an active worker and subscribe to its result instead of issuing a new query.

  • Once the database responds, the result is broadcast to all subscribers, completing all requests with a single round trip to the database.

See the diagram below:
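
The flow above is essentially a keyed single-flight pattern. Below is a minimal asyncio sketch of the coalescing logic in Python; Discord’s real services are written in Rust on Tokio, and fetch_from_db is a placeholder for the actual ScyllaDB query.

```python
import asyncio

class RequestCoalescer:
    """Ensures at most one in-flight database query per key (e.g. channel_id)."""

    def __init__(self):
        self._inflight: dict[object, asyncio.Future] = {}

    async def get(self, key, fetch_from_db):
        existing = self._inflight.get(key)
        if existing is not None:
            # A worker is already querying this key: subscribe to its result.
            return await existing

        # Become the worker for this key.
        future = asyncio.get_running_loop().create_future()
        self._inflight[key] = future
        try:
            result = await fetch_from_db(key)   # single round trip to the database
            future.set_result(result)
            return result
        except Exception as exc:
            future.set_exception(exc)
            raise
        finally:
            del self._inflight[key]

async def demo():
    calls = 0

    async def fetch_from_db(channel_id):        # placeholder for the real ScyllaDB query
        nonlocal calls
        calls += 1
        await asyncio.sleep(0.05)
        return f"messages for channel {channel_id}"

    coalescer = RequestCoalescer()
    results = await asyncio.gather(*[coalescer.get(42, fetch_from_db) for _ in range(100)])
    print(len(results), "responses from", calls, "database query")  # 100 responses, 1 query

asyncio.run(demo())
```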

To support this pattern at scale, the system uses consistent hash-based routing. Requests are routed using a key, typically the channel_id. This allows all traffic for the same channel to be handled by the same instance of the data service. 

Ultimately, the Rust-based data services help offload concurrency and coordination away from the database. They flatten spikes in traffic, reduce duplicated load, and provide a stable interface to ScyllaDB. 

The result for Discord was higher throughput, better latency under load, and fewer emergencies during traffic surges.

Migration Strategy

Migrating a database that stores trillions of messages is not a trivial problem. The primary goals of this migration were clear:

  • There should be no downtime. The messaging system had to remain fully available throughout the process.

  • The migration had to run at high throughput so the process could finish quickly; the longer two systems remain active, the higher the complexity and risk.

The entire migration process was divided into phases:

Phase 1: Dual Writes with a Cutover Point

The team began by setting up dual writes. Every new message was written to both Apache Cassandra® and ScyllaDB. A clear cutover timestamp defined which data belonged to the "new" world and which still needed to be migrated from the "old."

This allowed the system to adopt ScyllaDB for recent data while leaving historical messages intact in Apache Cassandra® until the backfill completed.

Phase 2: Historical Backfill Using Spark

The initial plan for historical migration relied on ScyllaDB’s Spark-based migrator. 

This approach was stable but slow. Even after tuning, the projected timeline was three months to complete the full backfill. That timeline wasn't acceptable, given the ongoing operational risks with Apache Cassandra®.

Phase 3: A Rust-Powered Rewrite

Instead of accepting the delay, the team extended their Rust data service framework to handle bulk migration. This new custom migrator:

  • Read token ranges from Apache Cassandra®, identifying contiguous data blocks.

  • Stored progress in SQLite, allowing for checkpoints and resumability.

  • Wrote directly into ScyllaDB using high-throughput, concurrent write operations.

The result was a dramatic improvement. The custom migrator achieved a throughput of 3.2 million messages per second, reducing the total migration time from months to just 9 days. This change also simplified the plan. With fast migration in place, the team could migrate everything at once instead of splitting logic between "old" and "new" systems.
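
A rough Python sketch of the checkpointing idea, with progress stored in SQLite so the migrator can resume after a crash, might look like the following; Discord’s migrator is written in Rust, and read_range / write_batch are placeholders for the Cassandra reads and ScyllaDB writes.

```python
import sqlite3

def migrate(token_ranges, read_range, write_batch, checkpoint_db="progress.db"):
    """Copy each token range once, checkpointing completed ranges in SQLite.

    token_ranges is an iterable of (start, end) tokens; read_range and
    write_batch are placeholders for the Cassandra read and the ScyllaDB write.
    """
    db = sqlite3.connect(checkpoint_db)
    db.execute(
        "CREATE TABLE IF NOT EXISTS done (start_token INTEGER PRIMARY KEY, end_token INTEGER)"
    )
    completed = {row[0] for row in db.execute("SELECT start_token FROM done")}

    for start, end in token_ranges:
        if start in completed:              # already migrated before a restart: skip
            continue
        rows = read_range(start, end)       # contiguous block of messages from the old cluster
        write_batch(rows)                   # concurrent, high-throughput writes into ScyllaDB
        db.execute("INSERT INTO done VALUES (?, ?)", (start, end))
        db.commit()                         # checkpoint: safe to resume from here after a crash
    db.close()
```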

Final Step: Validation and Cutover

To ensure data integrity, a portion of live read traffic was mirrored to both databases, and the responses were compared. Once the system consistently returned matching results, the final cutover was scheduled.

In May 2022, the switch was flipped. ScyllaDB became the primary data store for Discord messages.

Post-Migration Results

After the migration, the system footprint shrank significantly. The Apache Cassandra® cluster had grown to 177 nodes to keep up with storage and performance demands. ScyllaDB required only 72 nodes to handle the same workload.

This wasn’t just about node count. Each ScyllaDB node ran with 9 TB of disk space, compared to an average of 4 TB on Apache Cassandra® nodes. The combination of higher density and better performance per node translated into lower hardware and maintenance overhead.

Latency Improvements

The performance gains were clear and measurable.

  • Historical message reads, previously unpredictable, became consistent. The earlier p99 latency ranged between 40 and 125 milliseconds, depending on compaction status and read amplification. ScyllaDB brought that down to a steady 15 milliseconds at p99.

  • Message insert latency flattened out. Apache Cassandra® showed p99 insert latencies between 5 and 70 milliseconds, while ScyllaDB held stable at 5 milliseconds.

Operational Stability

One of the biggest wins was operational calm.

  • There was no more need for GC tuning. With ScyllaDB written in C++, the team no longer had to spend hours tweaking JVM settings to avoid unpredictable pauses.

  • Weekend firefights became less frequent. Compaction backlogs and hot partitions during the earlier design regularly triggered alerts and manual interventions. ScyllaDB with the data services handled the same load without the need for such interventions.

  • The system gained enough headroom to support new product features and larger bursts of traffic. The database no longer held back application growth.

Conclusion

The real test of any system comes when traffic patterns shift from expected to chaotic. During the 2022 FIFA World Cup Final, Discord’s message infrastructure experienced exactly that kind of stress test and passed cleanly.

As Argentina and France battled through regular time, extra time, and penalties, user activity surged across the platform. Each key moment (goals by Messi, Mbappé, the equalizers, the shootout) created massive spikes in message traffic, visible in monitoring dashboards almost in real time. 

Message sends surged, and read traffic ballooned. The kind of workload that used to trigger hot partitions and paging alerts during the earlier design now ran smoothly. Some key takeaways were as follows:

  • Rust-based data services absorbed the concurrent request load through coalescing and consistent routing.

  • ScyllaDB sustained peak throughput with stable latencies.

  • Engineers didn’t need to intervene. The platform stayed quiet and responsive throughout one of the most globally watched sporting events in history.

Note: Apache Cassandra® is a registered trademark of the Apache Software Foundation.

References:


SPONSOR US

Get your product in front of more than 1,000,000 tech professionals.

Our newsletter puts your products and services directly in front of an audience that matters - hundreds of thousands of engineering leaders and senior engineers - who have influence over significant tech decisions and big purchases.

Space Fills Up Fast - Reserve Today

Ad spots typically sell out about 4 weeks in advance. To ensure your ad reaches this influential audience, reserve your space now by emailing [email protected].

EP170: All-in-One Technical Interview Prep Kit

2025-07-05 23:30:35

😘 Kiss bugs goodbye with fully automated end-to-end test coverage (Sponsored)

Bugs sneak out when less than 80% of user flows are tested before shipping. However, getting that kind of coverage (and staying there) is hard and pricey for any team.

QA Wolf’s AI-native service provides high-volume, high-speed test coverage for web and mobile apps, reducing your organization’s QA cycles to less than 15 minutes.

With QA Wolf, engineering teams move faster, releases stay on track, and testing happens automatically, so developers can focus on building, not debugging.

Drata’s team of 80+ engineers achieved 4x more test cases and 86% faster QA cycles.

⭐ Rated 4.8/5 on G2

Schedule a demo to learn more


This week’s system design refresher:

  • All-in-One ByteByteGo Technical Interview Prep Kit

  • Best ways to test system functionality

  • How CQRS Works

  • How MongoDB Works

  • Who’s Hiring Now

  • SPONSOR US


ByteByteGo Technical Interview Prep Kit

Launching the All-in-one interview prep. We’re making all the books available on the ByteByteGo website.

What's included:

  • System Design Interview

  • Coding Interview Patterns

  • Object-Oriented Design Interview

  • How to Write a Good Resume

  • Behavioral Interview (coming soon)

  • Machine Learning System Design Interview

  • Generative AI System Design Interview

  • Mobile System Design Interview

  • And more to come

Launch sale: 50% off


Best ways to test system functionality

Testing system functionality is a crucial step in software development and engineering processes.

It ensures that a system or software application performs as expected, meets user requirements, and operates reliably.

Here we delve into the best ways:

  1. Unit Testing: Ensures individual code components work correctly in isolation (see the small pytest sketch after this list).

  2. Integration Testing: Verifies that different system parts function seamlessly together.

  3. System Testing: Assesses the entire system's compliance with user requirements and performance.

  4. Load Testing: Tests a system's ability to handle high workloads and identifies performance issues.

  5. Error Testing: Evaluates how the software handles invalid inputs and error conditions.

  6. Test Automation: Automates test case execution for efficiency, repeatability, and error reduction.
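
As a tiny illustration of items 1 and 5, the sketch below unit-tests a function in isolation with pytest, including one error-condition case. The divide function is just an invented example.

```python
import pytest

def divide(a: float, b: float) -> float:
    if b == 0:
        raise ValueError("division by zero")
    return a / b

def test_divide_happy_path():
    # Unit test: the component works correctly in isolation.
    assert divide(10, 4) == 2.5

def test_divide_rejects_invalid_input():
    # Error test: invalid input is handled explicitly, not silently.
    with pytest.raises(ValueError):
        divide(1, 0)
```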

Over to you: How do you approach testing system functionality in your software development or engineering projects?



How CQRS Works

CQRS (Command Query Responsibility Segregation) separates write (Command) and read (Query) operations for better scalability and maintainability.

Here’s how it works:

1 - The client sends a command to update the system state. A Command Handler validates and executes logic using the Domain Model.

2 - Changes are saved in the Write Database and can also be saved to an Event Store. Events are emitted to update the Read Model asynchronously.

3 - The projections are stored in the Read Database. This database is eventually consistent with the Write Database.

4 - On the query side, the client sends a query to retrieve data.

5 - A Query Handler fetches data from the Read Database, which contains precomputed projections.

6 - Results are returned to the client without hitting the write model or the write database.
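
Here is a stripped-down Python sketch of that flow: the command side validates, writes to the write store, records an event, and a projector updates a separate read store that queries read from. Plain dictionaries stand in for the write database, event store, and read database.

```python
write_db, event_store, read_db = {}, [], {}    # stand-ins for real databases

# --- Command side ----------------------------------------------------------
def handle_place_order(order_id: str, amount: float):
    if amount <= 0:                            # 1. command handler validates first
        raise ValueError("amount must be positive")
    write_db[order_id] = {"amount": amount}    # 2. persist the new state
    event = {"type": "OrderPlaced", "order_id": order_id, "amount": amount}
    event_store.append(event)                  # ...and record the event
    project(event)                             # emitted asynchronously in a real system

# --- Projection (keeps the read model eventually consistent) ----------------
def project(event):
    if event["type"] == "OrderPlaced":
        read_db[event["order_id"]] = {         # 3. precomputed view for queries
            "order_id": event["order_id"],
            "amount": event["amount"],
            "status": "placed",
        }

# --- Query side --------------------------------------------------------------
def handle_get_order(order_id: str):
    return read_db.get(order_id)               # 5/6. read model only, no write DB access

handle_place_order("o-1", 49.99)
print(handle_get_order("o-1"))
```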

Over to you: What else will you add to understand CQRS?


How MongoDB Works

MongoDB is a popular NoSQL database designed for flexibility, scalability, and high performance. It stores data in a JSON-like format (BSON) and supports horizontal scaling through sharding and replication.

Here’s how it works (a short client-side sketch follows these steps):

  1. Client application connects via a MongoDB Driver to perform read/write operations.

  2. The Query Router (mongos) acts as a mediator, directing queries to the appropriate shard based on the data’s shard key.

  3. Config servers store metadata and routing information. This helps query routers locate data across shards.

  4. The data is distributed across multiple shards to support horizontal scaling.

  5. Each shard is a replica set that consists of one primary node for handling writes and multiple secondary nodes for high availability and read scaling.

  6. If a primary fails, a secondary is automatically elected to replace it and maintain availability.
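
For a feel of the client side, here is a short sketch using the PyMongo driver. The connection string and collection names are placeholders, and the sharding/replica-set topology is configured on the cluster rather than in this code.

```python
from pymongo import MongoClient

# The driver connects to a mongos query router (or directly to a replica set).
# Connection string and names below are placeholders.
client = MongoClient("mongodb://localhost:27017")
db = client["app"]
users = db["users"]

# Writes go to the primary of the owning shard's replica set.
users.insert_one({"user_id": 42, "name": "Ada", "country": "UK"})

# Reads are routed by the shard key (if the collection is sharded on user_id,
# this is a targeted single-shard query rather than a scatter-gather).
doc = users.find_one({"user_id": 42})
print(doc)
```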

Over to you: What else will you add to understand MongoDB’s working better?


Hiring Now

We collaborate with Jobright.ai (an AI job search copilot trusted by 500K+ tech professionals) to curate this job list.

This Week’s High-Impact Roles at Fast-Growing AI Startups

High Salary SWE Roles this week

Today’s latest ML positions - hiring now!


SPONSOR US

Get your product in front of more than 1,000,000 tech professionals.

Our newsletter puts your products and services directly in front of an audience that matters - hundreds of thousands of engineering leaders and senior engineers - who have influence over significant tech decisions and big purchases.

Space Fills Up Fast - Reserve Today

Ad spots typically sell out about 4 weeks in advance. To ensure your ad reaches this influential audience, reserve your space now by emailing [email protected].

A Guide to Database Replication: Key Concepts and Strategies

2025-07-03 23:31:05

Every modern application relies on data, and users expect that data to be fast, current, and always accessible. However, databases are not magic. They can fail or slow down under load. They can also encounter physical and geographic limits, which is where replication becomes necessary. 

Database Replication means keeping copies of the same data across multiple machines. These machines can sit in the same data center or be spread across the globe. The goal is straightforward: 

  • Increase fault tolerance.

  • Scale reads.

  • Reduce latency by bringing data closer to where it's needed.

Replication sits at the heart of any system that aims to survive failures without losing data or disappointing users. Whether it's a social feed updating in milliseconds, an e-commerce site handling flash sales, or a financial system processing global transactions, replication ensures the system continues to operate, even when parts of it break.

However, replication also introduces complexity. It forces difficult decisions around consistency, availability, and performance. The database might be up, but a lagging replica can still serve stale data. A network partition might make two leader nodes think they’re in charge, leading to split-brain writes. Designing around these issues is non-trivial.

In this article, we walk through the concept of replication lag and major replication strategies used in distributed databases today. We will cover single-leader, multi-leader, and leaderless replication models, breaking down how each works, what problems they solve, and where they fall apart. 

Why Replicate Data?

Read more

7 Years, 8 Books, 1 Launch. A lot more to come!

2025-07-02 23:50:25

7 Years, 8 Books, 1 Launch. A lot more to come!

Launching the All-in-one interview prep.

7 years ago, I quit my last job at Twitter. I had no idea what to do next.

For some random reason, I spent 1.5 years writing the first System Design Interview book.

People liked it, so I found a co-author and we wrote System Design Interview – Volume 2 together.

People seemed to like that too, so I decided to collaborate with more amazing authors.

One after another, we’ve successfully published 7 Amazon bestsellers, with more on the way this year.

It’s been a wonderful journey, and I could never have imagined we’d come this far.

Everyone says books don’t make money, but sometimes, certain things are worth more than just money. And I’m grateful we were also able to figure out a business model along the way.

We could never have reached this point without all the amazing people who supported us.

For that, I want to say: thank you from the bottom of my heart ❤️

Today, we’re making all the books available on the ByteByteGo website.

Check it out here:

Launch sale: 50% off