
I built an Open Source Instagram Unfollowers tool (No Login required)

2025-12-03 19:09:46

Hi everyone! 👋

I was tired of "Unfollowers" apps that require a monthly subscription, show tons of ads, or worse—ask for my Instagram password (which is a huge security risk).

So, being a developer, I decided to build a safe, open-source alternative. I took inspiration from existing community scripts and wrapped them into a modern, user-friendly tool with a brand new Glassmorphism UI and Mobile support.

🚀 What is "Instagram Unfollowers 2025"?

It's a script that runs locally in your browser console. It scans your followers/following lists using your active session and shows you exactly who isn't following you back.

✨ Key Features

  • 🛡️ 100% Safe: It runs on your side (client-side). No password required.
  • ⚡ No Login: It uses your active browser session cookies securely.
  • 📱 Mobile Ready: Works on iOS/Android via a simple bookmarklet trick.
  • 🧠 Smart Logic: Includes "Safe Mode" delays to prevent soft-bans from Instagram.
  • 🎨 Modern UI: Built with Preact and Shadow DOM to inject a beautiful interface without breaking Instagram's layout.

📸 Preview

Instagram Unfollowers Dashboard
(The tool running on Desktop)

🛠️ How to use it

You don't need to install Node.js or clone the repo if you just want to use it.

  1. Go to the official site: https://edvincodes.github.io/InstagramUnfollowers/
  2. Click the "Copy Code" button.
  3. Open Instagram.com on your PC (or Mobile browser).
  4. Open the Developer Console (F12) and paste the code.
  5. Hit Enter and enjoy!

👨‍💻 For Developers (The Tech Stack)

For those interested in the code, the project is built using:

  • TypeScript for type safety.
  • Preact for a lightweight UI inside the browser.
  • Shadow DOM to isolate styles from Instagram's CSS.
  • Webpack to bundle everything into a single copy-pasteable script.

It is completely Open Source. You can audit the code, contribute, or give it a star ⭐️ here:

👉 GitHub Repository: EdvinCodes/InstagramUnfollowers

🐛 Feedback & Support

This project is heavily based on community knowledge, and I want to keep improving it.

If you find any bugs or have suggestions, please let me know in the comments below or email me directly at: [email protected].

Hope it helps you clean up your feed! Happy coding! 🚀

SeaORM 2.0: A closer look

2025-12-03 19:01:24

SeaORM 2.0 Banner

In the previous blog post, we highlighted some of the new features in SeaORM 2.0. In this post, we're going to take a closer look at some of the changes under the hood.

Overhauled Entity::insert_many

#2628 We've received many issue reports around the insert_many API. Previously, insert_many shared the same helper struct as insert_one, which led to an awkward API:

let res = Bakery::insert_many(std::iter::empty())
    .on_empty_do_nothing() // <- you needed to add this,
                           // otherwise insert empty [] would lead to error
    .exec(db)
    .await;

assert!(matches!(res, Ok(TryInsertResult::Empty)));

After careful consideration, we made a number of changes in 2.0:

  1. removed APIs (e.g. Insert::add) that can panic
  2. added a new helper struct InsertMany, whose last_insert_id is now Option<Value>
  3. on an empty iterator, execution returns None (for last_insert_id) or vec![] (when returning)
  4. the TryInsert API is unchanged

i.e. now last_insert_id is Option<Value> for InsertMany:

struct InsertManyResult<A: ActiveModelTrait> {
    pub last_insert_id: Option<<PrimaryKey<A> as PrimaryKeyTrait>::ValueType>,
}

This means the awkwardness is removed:

let res = Entity::insert_many::<ActiveModel, _>([]).exec(db).await;

assert_eq!(res?.last_insert_id, None); // inserting nothing returns None

let res = Entity::insert_many([ActiveModel { id: Set(1) }, ActiveModel { id: Set(2) }])
    .exec(db)
    .await;

assert_eq!(res?.last_insert_id, Some(2)); // inserting something returns Some
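The new rule can be summed up in a few lines of plain Rust (a simplified sketch, not SeaORM's actual API; `last_insert_id` here is a stand-in for the field on `InsertManyResult`):

```rust
// Sketch of the 2.0 semantics: inserting an empty batch yields None,
// a non-empty batch yields the id of the last inserted row.
fn last_insert_id(inserted_ids: &[i64]) -> Option<i64> {
    // `last()` is None on an empty slice, mirroring InsertMany's behavior
    inserted_ids.last().copied()
}

fn main() {
    assert_eq!(last_insert_id(&[]), None);        // insert nothing -> None
    assert_eq!(last_insert_id(&[1, 2]), Some(2)); // insert something -> Some(last)
}
```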

exec_with_returning now returns a Vec<Model>, so it feels intuitive:

assert!(
    Entity::insert_many::<ActiveModel, _>([])
        .exec_with_returning(db)
        .await?
        .is_empty() // no footgun, nice
);

assert_eq!(
    Entity::insert_many([
        ActiveModel {
            id: NotSet,
            value: Set("two".into()),
        }
    ])
    .exec_with_returning(db)
    .await
    .unwrap(),
    [
        Model {
            id: 2,
            value: "two".into(),
        }
    ]
);

The on-conflict API is the same as before:

let res = Entity::insert_many([ActiveModel { id: Set(3) }, ActiveModel { id: Set(4) }])
    .on_conflict_do_nothing()
    .exec(db)
    .await;

assert!(matches!(res, Ok(TryInsertResult::Conflicted)));

Overhauled ConnectionTrait API

#2657
We overhauled the ConnectionTrait API. execute, query_one, query_all and stream now take a SeaQuery statement instead of a raw SQL statement.

So you don't have to access the backend to build the query yourself.

// old
let query: SelectStatement = Entity::find().filter(..).into_query();
let backend = self.db.get_database_backend();
let stmt = backend.build(&query);
let rows = self.db.query_all(stmt).await?;

// new
let query: SelectStatement = Entity::find().filter(..).into_query();
let rows = self.db.query_all(&query).await?;

A new set of methods execute_raw, query_one_raw, query_all_raw, stream_raw is added, so you can still do the following:

let backend = self.db.get_database_backend();
let stmt = backend.build(&query);

// new
let rows = self.db.query_all_raw(stmt).await?;

Better error handling in UpdateOne / DeleteOne

#2726 UpdateOne and DeleteOne no longer implement QueryFilter and QueryTrait directly. Those implementations could expose an SQL query whose condition is incomplete and therefore touches too many records.

// bad: the following is basically update all
let query: UpdateStatement = Update::one(cake::ActiveModel::default()).into_query();

To generate the right condition, we must make sure that the primary key is set on the input ActiveModel by
calling the validate() method:

Update::one(active_model)
  + .validate()? // checks the query; may yield PrimaryKeyNotSet error
    .build(DbBackend::Postgres)

Potential compile errors

If you need to access the generated SQL query, convert into ValidatedUpdateOne/ValidatedDeleteOne first.

error[E0599]: no method named `build` found for struct `query::update::UpdateOne` in the current scope
   --> src/entity/column.rs:607:22
    |
607 | /                 Update::one(active_model)
608 | |                     .build(DbBackend::Postgres)
    | |                     -^^^^^ method not found in `UpdateOne<A>`
    | |_____________________|
    |

Added has_many_via for reverse has many relation

Consider the following entities:

#[derive(Clone, Debug, PartialEq, Eq, DeriveEntityModel)]
#[sea_orm(table_name = "bakery")]
pub struct Model {
    #[sea_orm(primary_key)]
    pub id: i32,
    pub name: String,
    pub manager_id: i32,
    pub cashier_id: i32,
}

#[derive(Copy, Clone, Debug, EnumIter, DeriveRelation)]
pub enum Relation {
    #[sea_orm(
        belongs_to = "super::worker::Entity",
        from = "Column::ManagerId",
        to = "super::worker::Column::Id"
    )]
    Manager,
    #[sea_orm(
        belongs_to = "super::worker::Entity",
        from = "Column::CashierId",
        to = "super::worker::Column::Id"
    )]
    Cashier,
}
#[derive(Clone, Debug, PartialEq, Eq, DeriveEntityModel)]
#[sea_orm(table_name = "worker")]
pub struct Model {
    #[sea_orm(primary_key)]
    pub id: i32,
    pub name: String,
}

There exist two relations between them:

Bakery -> Worker (Manager)
       -> Worker (Cashier)

It's now possible to define the inverse side of the relations in Worker:

#[derive(Clone, Debug, PartialEq, Eq, DeriveEntityModel)]
#[sea_orm(table_name = "worker")]
pub struct Model {
    #[sea_orm(primary_key)]
    pub id: i32,
    pub name: String,
}

#[derive(Copy, Clone, Debug, EnumIter, DeriveRelation)]
pub enum Relation {
    #[sea_orm(has_many = "super::bakery::Entity", via = "Relation::Manager")]
    BakeryManager,
    #[sea_orm(has_many = "super::bakery::Entity", via = "Relation::Cashier")]
    BakeryCashier,
}

These relations can then be used in queries:

assert_eq!(
    worker::Entity::find().join(
        JoinType::LeftJoin,
        worker::Relation::BakeryManager.def(),
    )
    .build(DbBackend::Sqlite)
    .to_string(),
    r#"SELECT "worker"."id", "worker"."name" FROM "worker"
       LEFT JOIN "bakery" ON "worker"."id" = "bakery"."manager_id""#
);

Use of transaction with generic connections

You can already use TransactionTrait as a generic parameter to define functions accepting any connection object that can initiate transactions.

In SeaORM 2.0, there are new database-connection-like objects: RestrictedConnection and RestrictedTransaction. They implement ConnectionTrait and TransactionTrait, and behave just like normal DatabaseConnections, except that they perform additional checks on queries.

| Connection type | Associated transaction type |
|---|---|
| DatabaseConnection | DatabaseTransaction |
| RestrictedConnection | RestrictedTransaction |

// new connection type
pub struct RestrictedConnection {
    conn: DatabaseConnection, // just a wrapper
    user_id: UserId,
}

impl TransactionTrait for RestrictedConnection {
    type Transaction = RestrictedTransaction; // added associated type
}

Meaning the following would continue to work:

// accepts any one of DatabaseConnection / DatabaseTransaction / RestrictedConnection / RestrictedTransaction.
// nested transactions will be spawned for transaction objects
async fn perform_actions<C: TransactionTrait>(
    db: &C,
    actions: &[Action],
) -> Result<(), DbErr> {
    let txn = db.begin().await?;

    for action in actions {
        txn.execute(perform(action)).await?;
    }

    txn.commit().await
}

Removing panics from API

SeaORM has a large API surface. We already removed a great number of unwraps from the codebase in the 1.0 release, but some panics due to misuse of the API could still happen.

Once again, we've tried to remove the remaining panics.

  • #2630 Added a new error variant BackendNotSupported. Previously, it panicked with e.g. "Database backend doesn't support RETURNING"
let result = cake::Entity::insert_many([])
    .exec_with_returning_keys(db)
    .await;

if db.support_returning() {
    // Postgres and SQLite
    assert_eq!(result.unwrap(), []);
} else {
    // MySQL
    assert!(matches!(result, Err(DbErr::BackendNotSupported { .. })));
}
  • #2627 Added a new error variant PrimaryKeyNotSet. Previously, it panicked with "PrimaryKey is not set"
assert!(matches!(
    Update::one(cake::ActiveModel {
        ..Default::default()
    })
    .exec(&db)
    .await,
    Err(DbErr::PrimaryKeyNotSet { .. })
));
  • #2634 Remove panics in Schema::create_enum_from_active_enum
// method can now return None
fn create_enum_from_active_enum<A>(&self) -> Option<TypeCreateStatement>
  • #2628 Remove panickable APIs from insert
    /// Add a Model to `Insert`
    ///
    /// # Panics
    ///
    /// Panics if the rows have different column sets from what've previously
    /// been cached in the query statement
  - pub fn add<M>(mut self, m: M) -> Self
  • #2637 Remove panics in loader

Enhancements

These are small touch-ups, but together they can make a big difference.

Added shorthand for Postgres = ANY

Added ColumnTrait::eq_any as a shorthand for the = ANY operator. Postgres only.

// old: have to import sea-query
use sea_orm::sea_query::{Expr, extension::postgres::PgFunc};

cake::Entity::find()
    .filter(
        // have to qualify column manually
        Expr::col((cake::Entity, cake::Column::Id)).eq(PgFunc::any(vec![4, 5]))
    );

// new: just use sea-orm
assert_eq!(
    cake::Entity::find()
        .filter(cake::Column::Id.eq_any(vec![4, 5]))
        .build(DbBackend::Postgres)
        .to_string(),
    r#"SELECT "cake"."id", "cake"."name" FROM "cake"
       WHERE "cake"."id" = ANY(ARRAY [4,5])"#
);

Added big_pk_auto

// old
pub fn pk_auto<T: IntoIden>(name: T) -> ColumnDef {
    integer(name).auto_increment().primary_key().take()
}

// new: same as above but use big integer
pub fn big_pk_auto<T: IntoIden>(name: T) -> ColumnDef {
    big_integer(name).auto_increment().primary_key().take()
}

Added chrono::Utc to entity prelude

pub type ChronoUtc = chrono::Utc;

We can now rely on sea-orm's re-export:

// old: chrono has to be added in Cargo.toml
let ts: ChronoDateTimeUtc = chrono::Utc::now();
// new: use sea-orm's re-export
let ts: ChronoDateTimeUtc = ChronoUtc::now();

Breaking changes

Use &'static str in identifiers

#2667 Changed IdenStatic and EntityName definition. This change stemmed from the revamp of the Iden type system in SeaQuery, in which &'static str now has slightly less overhead.

trait IdenStatic {
    fn as_str(&self) -> &'static str; // added static lifetime
}
trait EntityName {
    fn table_name(&self) -> &'static str; // added static lifetime
}
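To see why this matters, here is a simplified sketch of what an implementation of the new trait shape looks like (hand-written for illustration; in practice the derive macros generate this): every identifier is a string literal baked into the binary, so as_str() borrows it for free instead of allocating a String.

```rust
// Simplified trait shape from the 2.0 definition: the &'static lifetime
// means implementations must return string literals, never owned data.
trait IdenStatic {
    fn as_str(&self) -> &'static str;
}

enum Column {
    Id,
    Name,
}

impl IdenStatic for Column {
    fn as_str(&self) -> &'static str {
        match self {
            Column::Id => "id",
            Column::Name => "name",
        }
    }
}

fn main() {
    // No allocation at query-build time: the &'static str outlives everything.
    assert_eq!(Column::Id.as_str(), "id");
    assert_eq!(Column::Name.as_str(), "name");
}
```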

QueryBuilder is no longer object safe

Removed DbBackend::get_query_builder() because QueryBuilder is no longer object safe. This change improved query building performance by 5-10%.

impl DbBackend {
    // This is removed
  - fn get_query_builder(&self) -> Box<dyn QueryBuilder>;
}

Previously, dyn SqlWriter was used everywhere.

fn prepare_table_create_statement(
    &self,
    create: &TableCreateStatement,
    sql: &mut dyn SqlWriter,
);

Now, it's a generic method:

fn prepare_table_create_statement(
    &self,
    create: &TableCreateStatement,
    sql: &mut impl SqlWriter, // note the impl
);

This change shouldn't impact most users because we have the following API:

pub trait StatementBuilder {
    fn build(&self, db_backend: &DbBackend) -> Statement;
}

// implemented for SelectStatement, InsertStatement, UpdateStatement, DeleteStatement, etc

Changed Database Connection

#2671 DatabaseConnection is changed from enum to struct. The original enum is moved into DatabaseConnection::inner. The new enum is named DatabaseConnectionType.

This allows DatabaseConnection to hold additional metadata.

// old
pub enum DatabaseConnection {
    SqlxMySqlPoolConnection(crate::SqlxMySqlPoolConnection),
    SqlxPostgresPoolConnection(crate::SqlxPostgresPoolConnection),
    SqlxSqlitePoolConnection(crate::SqlxSqlitePoolConnection),
    ..
}

// new
pub struct DatabaseConnection {
    pub inner: DatabaseConnectionType,
    ..
}

pub enum DatabaseConnectionType {
    SqlxMySqlPoolConnection(crate::SqlxMySqlPoolConnection),
    SqlxPostgresPoolConnection(crate::SqlxPostgresPoolConnection),
    SqlxSqlitePoolConnection(crate::SqlxSqlitePoolConnection),
    ..
}

Removed Derive Custom Column

#2667 Removed DeriveCustomColumn macro and default_as_str trait method. This was a legacy of the expanded entity format.

// This is no longer supported:
#[derive(Copy, Clone, Debug, EnumIter, DeriveCustomColumn)]
pub enum Column {
    Id,
    Name,
}

impl IdenStatic for Column {
    fn as_str(&self) -> &str {
        match self {
            Self::Name => "my_name",
            _ => self.default_as_str(),
        }
    }
}
// Do the following instead:
#[derive(Copy, Clone, Debug, EnumIter, DeriveColumn)]
pub enum Column {
    Id,
    #[sea_orm(column_name = "my_name")]
    Name,
}

Upgrades

  • tokio is now used in place of async-std in sea-orm-cli and examples as async-std has been deprecated.
  • Returning is now enabled for SQLite by default. SQLite introduced RETURNING in 3.35, released in 2021, so it should be the default by now.
  • #2596 Upgraded Rust Edition to 2024
  • Upgraded strum to 0.27

SQL Server Support

SQL Server for SeaORM offers the same SeaORM API for MSSQL. We ported all test cases and examples, complemented by MSSQL-specific documentation. If you are building enterprise software, you can request commercial access. It is currently based on SeaORM 1.0, but we will offer a free upgrade to existing users when SeaORM 2.0 is finalized.

🖥️ SeaORM Pro: Professional Admin Panel

SeaORM Pro is an admin panel solution allowing you to quickly and easily launch an admin panel for your application - frontend development skills not required, but certainly nice to have!

SeaORM Pro will be updated to support the latest features in SeaORM 2.0.

Features:

  • Full CRUD
  • Built on React + GraphQL
  • Built-in GraphQL resolver
  • Customize the UI with TOML config
  • Custom GraphQL endpoints (new in 2.0)
  • Role Based Access Control (new in 2.0)

More to come

SeaORM 2.0 is shaping up to be our most significant release yet - with a few breaking changes, plenty of enhancements, and a clear focus on developer experience. We'll dive into Role Based Access Control in the next post, so keep an eye out for the next update!

SeaORM 2.0 will launch alongside SeaQuery 1.0. If you make extensive use of SeaORM's underlying query builder, we recommend checking out our earlier blog post on SeaQuery 1.0 to get familiar with the changes.

SeaORM 2.0 has reached its release candidate phase. We'd love for you to try it out and help shape the final release by sharing your feedback.

🦀 Rustacean Sticker Pack

The Rustacean Sticker Pack is the perfect way to express your passion for Rust.
Our stickers are made with a premium water-resistant vinyl with a unique matte finish.

Sticker Pack Contents:

  • Logo of SeaQL projects: SeaQL, SeaORM, SeaQuery, Seaography
  • Mascots: Ferris the Crab x 3, Terres the Hermit Crab
  • The Rustacean wordmark

Support SeaQL and get a Sticker Pack!

Rustacean Sticker Pack by SeaQL

Browser Automation Protocols: CDP vs WebDriver Deep Dive

2025-12-03 19:00:07

A technical perspective on browser automation internals, protocol architectures, and when to use what.

Table of Contents

  1. The Two Protocols
  2. Architecture Comparison
  3. WebDriver Protocol (W3C)
  4. Chrome DevTools Protocol (CDP)
  5. Head-to-Head Comparison
  6. When to Use What
  7. Our CDP Implementation
  8. Production Examples
  9. Final Thoughts

The Two Protocols

Browser automation comes down to two fundamental approaches:

WebDriver - W3C standardized, cross-browser, high-level abstraction over HTTP REST.

CDP - Chrome's native debugging protocol, WebSocket-based, low-level access to browser internals.

Both solve browser automation. Neither is universally "better." Your use case dictates the choice.

Architecture Comparison

WebDriver Architecture

┌──────────────┐    HTTP/REST    ┌───────────────┐    Native    ┌─────────────┐
│    Client    │ ◄────────────► │    Driver     │ ◄──────────► │   Browser   │
│  (Selenium)  │   Port 4444    │ (chromedriver/│   Protocol   │   (Chrome)  │
└──────────────┘                │  geckodriver) │              └─────────────┘
                                └───────────────┘

Three-tier model:

  1. Client library sends HTTP requests
  2. Driver binary translates to browser-native calls
  3. Browser executes and responds

The middleman tax: Every command pays HTTP overhead + driver process latency.
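The middleman tax can be made concrete with a back-of-envelope cost model (the latency numbers below are illustrative assumptions, not benchmarks): WebDriver pays the HTTP round trip plus a driver hop on every command, while CDP pays a one-time WebSocket handshake and a small per-message framing cost.

```rust
// Illustrative cost model (assumed latencies, not measurements):
// WebDriver overhead scales per command; CDP is mostly a one-time setup cost.
fn webdriver_cost_ms(commands: u32) -> f64 {
    let http_round_trip = 2.0; // assumed HTTP request/response overhead per command
    let driver_hop = 1.0;      // assumed driver translation latency per command
    commands as f64 * (http_round_trip + driver_hop)
}

fn cdp_cost_ms(commands: u32) -> f64 {
    let handshake = 5.0;  // one-time WebSocket upgrade
    let ws_message = 0.3; // assumed per-message framing cost
    handshake + commands as f64 * ws_message
}

fn main() {
    // The gap grows linearly with the number of commands issued.
    for n in [10, 100, 1000] {
        println!(
            "{n} commands: WebDriver ~{:.0} ms, CDP ~{:.0} ms",
            webdriver_cost_ms(n),
            cdp_cost_ms(n)
        );
    }
}
```

Whatever the real constants are on your machine, the shape is the same: per-command overhead dominates WebDriver, while CDP amortizes its setup cost across the session.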

CDP Architecture

┌──────────────┐    WebSocket    ┌─────────────┐
│    Client    │ ◄────────────► │   Browser   │
│   (Direct)   │   Port 9222    │   (Chrome)  │
└──────────────┘                └─────────────┘

Two-tier model:

  1. Client connects directly to browser
  2. Persistent WebSocket, bidirectional streaming

No middleman. Direct protocol access. Events pushed in real-time.

WebDriver Protocol (W3C)

Overview

WebDriver has been a W3C Recommendation since 2018. It defines a REST API for browser automation with a focus on cross-browser compatibility.

Transport

HTTP/REST with JSON payloads:

POST /session HTTP/1.1
Content-Type: application/json

{
  "capabilities": {
    "browserName": "chrome",
    "browserVersion": "120"
  }
}

Session Lifecycle

# Create session
POST /session
→ {"sessionId": "abc123", "capabilities": {...}}

# All subsequent commands use session ID
POST /session/abc123/url
GET  /session/abc123/title
POST /session/abc123/element
DELETE /session/abc123

Core Endpoints

| Endpoint | Method | Purpose |
|---|---|---|
| /session | POST | Create new session |
| /session/{id} | DELETE | End session |
| /session/{id}/url | POST | Navigate to URL |
| /session/{id}/url | GET | Get current URL |
| /session/{id}/title | GET | Get page title |
| /session/{id}/element | POST | Find element |
| /session/{id}/element/{eid}/click | POST | Click element |
| /session/{id}/element/{eid}/value | POST | Send keys |
| /session/{id}/screenshot | GET | Capture screenshot |
| /session/{id}/execute/sync | POST | Execute JS |

Element Location Strategies

{
  "using": "css selector",
  "value": "button.submit"
}

Supported locators:

  • css selector
  • link text
  • partial link text
  • tag name
  • xpath

Example: Complete Flow

# 1. Create session
curl -X POST http://localhost:4444/session \
  -H "Content-Type: application/json" \
  -d '{"capabilities": {"browserName": "chrome"}}'

# Response: {"value": {"sessionId": "xyz789", ...}}

# 2. Navigate
curl -X POST http://localhost:4444/session/xyz789/url \
  -H "Content-Type: application/json" \
  -d '{"url": "https://example.com"}'

# 3. Find element
curl -X POST http://localhost:4444/session/xyz789/element \
  -H "Content-Type: application/json" \
  -d '{"using": "css selector", "value": "h1"}'

# Response: {"value": {"element-6066-...": "element-id-123"}}

# 4. Get text
curl http://localhost:4444/session/xyz789/element/element-id-123/text

# Response: {"value": "Example Domain"}

# 5. Screenshot
curl http://localhost:4444/session/xyz789/screenshot

# Response: {"value": "iVBORw0KGgo...base64..."}

# 6. Cleanup
curl -X DELETE http://localhost:4444/session/xyz789

Limitations

  1. No network interception - Can't inspect/modify HTTP traffic
  2. No console access - Can't capture console.log output
  3. No performance metrics - No access to rendering/memory data
  4. No real-time events - Polling only, no push notifications
  5. Driver dependency - Requires separate driver binary per browser
  6. Version coupling - Driver version must match browser version

Chrome DevTools Protocol (CDP)

Overview

CDP is Chrome's native debugging protocol. It's what DevTools uses internally. Direct access to 61 domains covering every browser capability.

Transport

Bidirectional WebSocket with JSON-RPC:

Client                                Browser
   │                                     │
   │──── {"id":1,"method":"Page.navigate", ───►
   │      "params":{"url":"..."}}        │
   │                                     │
   │◄─── {"id":1,"result":{"frameId":...}} ───
   │                                     │
   │◄─── {"method":"Page.loadEventFired", ────
   │      "params":{"timestamp":...}}    │
   │                                     │

Three message types:

  • Request: Client → Browser (has id + method)
  • Response: Browser → Client (has id + result/error)
  • Event: Browser → Client (has method only, no id)
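This classification depends only on which fields a frame carries, which a client can express directly (a minimal sketch in plain Rust; a real client would parse the JSON first, e.g. with serde_json):

```rust
// Sketch: a CDP frame is classified purely by the presence of `id` and `method`.
#[derive(Debug, PartialEq)]
enum CdpMessage {
    Request,  // has id + method: client -> browser command
    Response, // has id, no method: browser answering a request
    Event,    // has method, no id: browser pushing unsolicited data
}

fn classify(has_id: bool, has_method: bool) -> Option<CdpMessage> {
    match (has_id, has_method) {
        (true, true) => Some(CdpMessage::Request),
        (true, false) => Some(CdpMessage::Response),
        (false, true) => Some(CdpMessage::Event),
        (false, false) => None, // neither field: not a valid CDP frame
    }
}

fn main() {
    // {"id":1,"method":"Page.navigate",...} is a request
    assert_eq!(classify(true, true), Some(CdpMessage::Request));
    // {"id":1,"result":{...}} is the matching response
    assert_eq!(classify(true, false), Some(CdpMessage::Response));
    // {"method":"Page.loadEventFired",...} is an event
    assert_eq!(classify(false, true), Some(CdpMessage::Event));
}
```

Routing in a client follows the same split: responses are matched to pending requests by `id`, while events are dispatched to subscribers by `method`.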

Domain Organization

CDP organizes into domains. Each domain has methods and events.

Core domains:

| Domain | Methods | Events | Purpose |
|---|---|---|---|
| Page | 25+ | 15+ | Navigation, lifecycle, screenshots |
| Runtime | 20+ | 10+ | JS execution, console |
| DOM | 30+ | 10+ | Document structure |
| Network | 15+ | 20+ | HTTP traffic |
| Input | 5+ | 0 | Mouse, keyboard, touch |
| Emulation | 20+ | 0 | Device simulation |
| Target | 15+ | 5+ | Tab/window management |
| Debugger | 25+ | 10+ | JS debugging |
| Profiler | 10+ | 5+ | CPU profiling |
| HeapProfiler | 10+ | 5+ | Memory profiling |

HTTP Discovery Endpoints

Before WebSocket, discover targets via HTTP:

# List all debuggable targets
curl http://localhost:9222/json/list
[
  {
    "id": "ABC123",
    "type": "page",
    "title": "New Tab",
    "url": "chrome://newtab/",
    "webSocketDebuggerUrl": "ws://localhost:9222/devtools/page/ABC123"
  }
]

# Browser version
curl http://localhost:9222/json/version
{
  "Browser": "Chrome/120.0.0.0",
  "Protocol-Version": "1.3",
  "webSocketDebuggerUrl": "ws://localhost:9222/devtools/browser/XYZ"
}

# Create new tab (newer Chrome versions require PUT for this endpoint)
curl -X PUT "http://localhost:9222/json/new?https://example.com"

# Close tab
curl http://localhost:9222/json/close/ABC123

Protocol Examples

1. Navigation

// Enable Page domain first
 {"id": 1, "method": "Page.enable"}
 {"id": 1, "result": {}}

// Navigate
 {"id": 2, "method": "Page.navigate", "params": {"url": "https://example.com"}}
 {"id": 2, "result": {"frameId": "ABC", "loaderId": "XYZ"}}

// Events fired automatically
 {"method": "Page.frameStartedLoading", "params": {"frameId": "ABC"}}
 {"method": "Page.loadEventFired", "params": {"timestamp": 1234.56}}
 {"method": "Page.frameStoppedLoading", "params": {"frameId": "ABC"}}

2. JavaScript Evaluation

 {"id": 1, "method": "Runtime.enable"}
 {"id": 1, "result": {}}

 {"id": 2, "method": "Runtime.evaluate", "params": {
    "expression": "document.title",
    "returnByValue": true
  }}
 {"id": 2, "result": {
    "result": {"type": "string", "value": "Example Domain"}
  }}

// Complex evaluation
 {"id": 3, "method": "Runtime.evaluate", "params": {
    "expression": "(() => { return {width: window.innerWidth, height: window.innerHeight}; })()",
    "returnByValue": true
  }}
 {"id": 3, "result": {
    "result": {"type": "object", "value": {"width": 1920, "height": 1080}}
  }}

3. DOM Operations

 {"id": 1, "method": "DOM.enable"}
 {"id": 1, "result": {}}

 {"id": 2, "method": "DOM.getDocument", "params": {"depth": 0}}
 {"id": 2, "result": {
    "root": {"nodeId": 1, "nodeName": "#document", "childNodeCount": 2}
  }}

 {"id": 3, "method": "DOM.querySelector", "params": {"nodeId": 1, "selector": "h1"}}
 {"id": 3, "result": {"nodeId": 42}}

 {"id": 4, "method": "DOM.getOuterHTML", "params": {"nodeId": 42}}
 {"id": 4, "result": {"outerHTML": "<h1>Example Domain</h1>"}}

4. Network Interception

 {"id": 1, "method": "Network.enable"}
 {"id": 1, "result": {}}

// Events stream automatically
 {"method": "Network.requestWillBeSent", "params": {
    "requestId": "req-1",
    "request": {
      "url": "https://example.com/api/data",
      "method": "GET",
      "headers": {"Accept": "application/json"}
    },
    "timestamp": 1234.56,
    "type": "XHR"
  }}

 {"method": "Network.responseReceived", "params": {
    "requestId": "req-1",
    "response": {
      "status": 200,
      "statusText": "OK",
      "headers": {"content-type": "application/json"},
      "mimeType": "application/json"
    }
  }}

 {"method": "Network.loadingFinished", "params": {
    "requestId": "req-1",
    "encodedDataLength": 1234
  }}

// Get response body
 {"id": 2, "method": "Network.getResponseBody", "params": {"requestId": "req-1"}}
 {"id": 2, "result": {"body": "{\"data\": [...]}", "base64Encoded": false}}

5. Screenshots

 {"id": 1, "method": "Page.captureScreenshot", "params": {
    "format": "png",
    "quality": 100,
    "fromSurface": true
  }}
 {"id": 1, "result": {"data": "iVBORw0KGgoAAAANSUhEUgAAA..."}}

// Full page screenshot
 {"id": 2, "method": "Page.captureScreenshot", "params": {
    "format": "png",
    "captureBeyondViewport": true
  }}

// Specific region
 {"id": 3, "method": "Page.captureScreenshot", "params": {
    "format": "jpeg",
    "quality": 80,
    "clip": {"x": 0, "y": 0, "width": 800, "height": 600, "scale": 1}
  }}

6. Input Simulation

// Mouse click
 {"id": 1, "method": "Input.dispatchMouseEvent", "params": {
    "type": "mousePressed",
    "x": 100, "y": 200,
    "button": "left",
    "clickCount": 1
  }}
 {"id": 1, "result": {}}

 {"id": 2, "method": "Input.dispatchMouseEvent", "params": {
    "type": "mouseReleased",
    "x": 100, "y": 200,
    "button": "left",
    "clickCount": 1
  }}

// Type text
 {"id": 3, "method": "Input.insertText", "params": {"text": "Hello World"}}

// Key press
 {"id": 4, "method": "Input.dispatchKeyEvent", "params": {
    "type": "keyDown",
    "key": "Enter",
    "code": "Enter",
    "windowsVirtualKeyCode": 13
  }}
 {"id": 5, "method": "Input.dispatchKeyEvent", "params": {
    "type": "keyUp",
    "key": "Enter",
    "code": "Enter"
  }}

7. Console Capture

 {"id": 1, "method": "Runtime.enable"}
 {"id": 1, "result": {}}

// Console events stream automatically
 {"method": "Runtime.consoleAPICalled", "params": {
    "type": "log",
    "args": [{"type": "string", "value": "Hello from page"}],
    "timestamp": 1234567890.123
  }}

 {"method": "Runtime.consoleAPICalled", "params": {
    "type": "error",
    "args": [{"type": "string", "value": "Something went wrong"}],
    "stackTrace": {...}
  }}

8. Performance Metrics

 {"id": 1, "method": "Performance.enable"}
 {"id": 1, "result": {}}

 {"id": 2, "method": "Performance.getMetrics"}
 {"id": 2, "result": {
    "metrics": [
      {"name": "Timestamp", "value": 1234.56},
      {"name": "Documents", "value": 1},
      {"name": "Frames", "value": 1},
      {"name": "JSEventListeners", "value": 42},
      {"name": "Nodes", "value": 150},
      {"name": "LayoutCount", "value": 3},
      {"name": "RecalcStyleCount", "value": 5},
      {"name": "JSHeapUsedSize", "value": 10485760},
      {"name": "JSHeapTotalSize", "value": 16777216}
    ]
  }}

Head-to-Head Comparison

Protocol Level

| Aspect | WebDriver | CDP |
|---|---|---|
| Specification | W3C Standard | Chrome Internal |
| Transport | HTTP REST | WebSocket |
| Connection | Request/Response | Persistent + Events |
| Latency | Higher (HTTP per command) | Lower (single WS) |
| Message Format | JSON over HTTP | JSON-RPC over WS |

Architecture

| Aspect | WebDriver | CDP |
|---|---|---|
| Components | Client + Driver + Browser | Client + Browser |
| Driver Required | Yes (chromedriver, etc.) | No |
| Version Coupling | Driver ↔ Browser tight | Protocol versioned |
| Port | 4444 (driver) | 9222 (browser) |

Capabilities

| Feature | WebDriver | CDP |
|---|---|---|
| Navigation | ✅ | ✅ |
| Element Interaction | ✅ | ✅ |
| JavaScript Execution | ✅ | ✅ |
| Screenshots | ✅ | ✅ |
| Cookies | ✅ | ✅ |
| Network Interception | ❌ | ✅ |
| Console Access | ❌ | ✅ |
| Performance Metrics | ❌ | ✅ |
| Real-time Events | ❌ | ✅ |
| DOM Debugging | ❌ | ✅ |
| CPU Profiling | ❌ | ✅ |
| Memory Profiling | ❌ | ✅ |
| Geolocation Emulation | Limited | ✅ |
| Device Emulation | Limited | ✅ |
| Request Blocking | ❌ | ✅ |

Browser Support

| Browser | WebDriver | CDP |
|---|---|---|
| Chrome | ✅ | ✅ |
| Edge | ✅ | ✅ (Chromium) |
| Firefox | ✅ | Partial |
| Safari | ✅ | ❌ |
| Opera | ✅ | ✅ (Chromium) |

Ecosystem

| Tool | WebDriver | CDP |
|---|---|---|
| Selenium | Primary | Via BiDi |
| Puppeteer | ❌ | Primary |
| Playwright | Uses both | Uses both |
| Cypress | ❌ | Primary |

When to Use What

Use WebDriver When:

  1. Cross-browser testing - Need Safari, Firefox, Chrome uniformly
  2. Existing Selenium infrastructure - Large test suites already written
  3. Simple automation - Basic click, type, navigate workflows
  4. Compliance requirements - W3C standard may be mandated
  5. Team familiarity - Team knows Selenium well

Use CDP When:

  1. Chrome/Chromium only - Target browser is fixed
  2. Network interception - Mock APIs, block resources, modify requests
  3. Performance profiling - Need rendering metrics, memory analysis
  4. Console monitoring - Capture JS logs, errors, warnings
  5. Real-time events - React to page events as they happen
  6. Speed critical - Minimize automation overhead
  7. AI agents - Need granular control for autonomous browsing
  8. Advanced debugging - JS breakpoints, DOM inspection

Hybrid Approach (Playwright/Selenium 4)

Modern tools use both:

Playwright:
  - WebDriver for cross-browser compat
  - CDP for Chrome-specific features

Selenium 4 BiDi:
  - WebDriver base protocol
  - CDP bridge for advanced features

Our CDP Implementation

We built a production-ready Rust CDP client with two abstraction layers.

Project Structure

cdp-protocol/
├── src/
│   ├── lib.rs          # Public exports
│   ├── client.rs       # Low-level CDP client (WebSocket, routing)
│   ├── agent.rs        # High-level BrowserAgent (AI-friendly)
│   ├── types.rs        # Protocol message types
│   └── error.rs        # Error handling
├── examples/
│   ├── basic.rs        # Low-level usage
│   ├── agent.rs        # High-level AI agent
│   └── industrial.rs   # Parallel scraping
└── Cargo.toml

Dependencies

[dependencies]
tokio = { version = "1", features = ["full"] }
tokio-tungstenite = "0.21"
serde = { version = "1", features = ["derive"] }
serde_json = "1"
reqwest = { version = "0.11", features = ["json"] }
base64 = "0.21"
tracing = "0.1"

Layer 1: CdpClient (Low-Level)

Direct protocol access with convenience wrappers.

use cdp_protocol::{CdpClient, Result};

#[tokio::main]
async fn main() -> Result<()> {
    // Discovery
    let version = CdpClient::get_version("localhost", 9222).await?;
    println!("Browser: {}", version.browser);

    let targets = CdpClient::list_targets("localhost", 9222).await?;
    for target in &targets {
        println!("  - {} [{}]: {}", target.target_type, target.id, target.title);
    }

    // Connect
    let client = CdpClient::connect_to_page("localhost", 9222).await?;

    // Enable domains
    client.enable_domain("Page").await?;
    client.enable_domain("Runtime").await?;
    client.enable_domain("DOM").await?;

    // Navigate
    let nav = client.navigate("https://example.com").await?;
    println!("Frame ID: {}", nav.frame_id);

    tokio::time::sleep(std::time::Duration::from_secs(2)).await;

    // JavaScript
    let title: String = client.eval("document.title").await?;
    println!("Title: {}", title);

    let result = client.evaluate("1 + 2 * 3").await?;
    println!("Math: {:?}", result.result.value);

    // DOM
    let doc = client.get_document().await?;
    let h1_id = client.query_selector(doc.node_id, "h1").await?;
    if h1_id > 0 {
        let html = client.get_outer_html(h1_id).await?;
        println!("H1: {}", html);
    }

    // Screenshot
    client.screenshot_to_file("example.png").await?;

    // Cookies
    let cookies = client.get_cookies().await?;
    println!("Cookies: {}", cookies.len());

    Ok(())
}

Layer 2: BrowserAgent (High-Level)

AI-friendly interface with JSON action dispatch.

use cdp_protocol::{BrowserAgent, BrowserAction, ActionResult};

#[tokio::main]
async fn main() -> Result<()> {
    let agent = BrowserAgent::connect("localhost", 9222).await?;

    // Programmatic
    agent.execute(BrowserAction::Navigate {
        url: "https://example.com".to_string(),
    }).await;

    agent.execute(BrowserAction::GetTitle).await;

    // JSON (LLM tool calls)
    agent.execute_json(r#"{"action": "navigate", "url": "https://rust-lang.org"}"#).await;
    agent.execute_json(r#"{"action": "wait", "ms": 2000}"#).await;
    agent.execute_json(r#"{"action": "screenshot", "path": "rust.png"}"#).await;

    Ok(())
}

Action Builder

Fluent API for chaining:

use cdp_protocol::ActionBuilder;

let actions = ActionBuilder::new()
    .navigate("https://www.google.com")
    .wait(1500)
    .fill("input[name='q']", "Rust programming")
    .press_key("Enter")
    .wait(2000)
    .screenshot(Some("search.png"))
    .build();

let results = agent.execute_many(actions).await;

Supported Actions

pub enum BrowserAction {
    // Navigation
    Navigate { url: String },
    GoBack,
    GoForward,
    Reload,

    // Interaction
    Click { selector: Option<String>, x: Option<f64>, y: Option<f64> },
    Type { text: String, selector: Option<String> },
    Fill { selector: String, value: String },
    Submit { selector: Option<String> },
    PressKey { key: String },

    // Inspection
    GetTitle,
    GetUrl,
    GetText,
    GetContent { selector: Option<String> },
    GetLinks,
    GetAttributes { selector: String },
    Exists { selector: String },

    // Capture
    Screenshot { path: Option<String> },

    // Scripting
    Evaluate { expression: String },

    // Waiting
    Wait { ms: u64 },
    WaitForSelector { selector: String, timeout_ms: u64 },

    // Layout
    Scroll { x: f64, y: f64 },
    SetViewport { width: i32, height: i32, mobile: bool },

    // Metrics
    GetMetrics,
}

Production Examples

Form Automation

let search_actions = vec![
    BrowserAction::Navigate {
        url: "https://duckduckgo.com".to_string(),
    },
    BrowserAction::Wait { ms: 1500 },
    BrowserAction::Fill {
        selector: "input[name='q']".to_string(),
        value: "Rust programming language".to_string(),
    },
    BrowserAction::PressKey {
        key: "Enter".to_string(),
    },
    BrowserAction::Wait { ms: 2000 },
    BrowserAction::Screenshot {
        path: Some("search_results.png".to_string()),
    },
    BrowserAction::GetTitle,
];

for action in search_actions {
    let result = agent.execute(action).await;
    if !result.is_success() {
        println!("Failed: {:?}", result);
        break;
    }
}

Data Extraction

let result = agent.execute(BrowserAction::Evaluate {
    expression: r#"
        (() => {
            return {
                viewport: {
                    width: window.innerWidth,
                    height: window.innerHeight
                },
                userAgent: navigator.userAgent,
                language: navigator.language,
                cookiesEnabled: navigator.cookieEnabled,
                platform: navigator.platform
            };
        })()
    "#.to_string(),
}).await;

Industrial Scraping (50 Pages Parallel)

use cdp_protocol::{CdpClient, Result};
use std::sync::Arc;
use std::time::Instant;
use tokio::sync::Semaphore;

const NUM_PAGES: usize = 50;
const MAX_CONCURRENT: usize = 10;

const URLS: &[&str] = &[
    "https://www.rust-lang.org",
    "https://www.google.com",
    "https://github.com",
    "https://stackoverflow.com",
    "https://news.ycombinator.com",
    "https://www.wikipedia.org",
    "https://www.reddit.com",
    "https://docs.rs",
    "https://crates.io",
    "https://www.mozilla.org",
];

#[tokio::main]
async fn main() -> Result<()> {
    std::fs::create_dir_all("screenshots").ok();

    let start = Instant::now();
    let semaphore = Arc::new(Semaphore::new(MAX_CONCURRENT));

    let mut handles = Vec::with_capacity(NUM_PAGES);

    for i in 0..NUM_PAGES {
        let url = URLS[i % URLS.len()].to_string();
        let sem = semaphore.clone();

        let handle = tokio::spawn(async move {
            let _permit = sem.acquire().await.unwrap();
            process_page(i, &url).await
        });

        handles.push(handle);
    }

    let mut success = 0;
    let mut failed = 0;

    for (i, handle) in handles.into_iter().enumerate() {
        match handle.await {
            Ok(Ok((title, elapsed))) => {
                println!("[{:3}] ✓ {} ({:.1}s)", i, title, elapsed);
                success += 1;
            }
            Ok(Err(e)) => {
                println!("[{:3}] ✗ Error: {}", i, e);
                failed += 1;
            }
            Err(e) => {
                println!("[{:3}] ✗ Panic: {}", i, e);
                failed += 1;
            }
        }
    }

    let total = start.elapsed();
    println!("\nTotal: {:.2}s | Success: {} | Failed: {} | {:.2} pages/sec",
        total.as_secs_f64(), success, failed,
        NUM_PAGES as f64 / total.as_secs_f64());

    Ok(())
}

async fn process_page(id: usize, url: &str) -> Result<(String, f64)> {
    let start = Instant::now();

    let target = CdpClient::create_tab("localhost", 9222, Some(url)).await?;
    let ws_url = target.web_socket_debugger_url
        .ok_or_else(|| cdp_protocol::CdpError::InvalidUrl("No WS URL".into()))?;

    let client = CdpClient::connect(&ws_url).await?;
    client.enable_domain("Page").await?;
    client.enable_domain("Runtime").await?;

    tokio::time::sleep(std::time::Duration::from_millis(2000)).await;

    let title: String = client.eval("document.title").await
        .unwrap_or_else(|_| "Unknown".to_string());

    client.screenshot_to_file(&format!("screenshots/page_{:03}.png", id)).await?;

    Ok((title, start.elapsed().as_secs_f64()))
}

Output:

=== Industrial Scraping Demo ===
Pages to process: 50
Max concurrent: 10

[  0] ✓ Rust Programming Language (3.2s)
[  1] ✓ Google (2.8s)
...
[ 49] ✓ Hacker News (2.9s)

Total: 18.42s | Success: 50 | Failed: 0 | 2.71 pages/sec

Final Thoughts

Protocol Selection Matrix

| Requirement | Recommendation |
|-------------|----------------|
| Cross-browser testing | WebDriver |
| Chrome-only, max performance | CDP |
| Network mocking | CDP |
| AI agent automation | CDP |
| Existing Selenium codebase | WebDriver (+ BiDi for CDP features) |
| Console/log capture | CDP |
| Performance profiling | CDP |
| Simple E2E tests | Either works |

The Future

WebDriver BiDi is bridging the gap - adding CDP-like capabilities to WebDriver. Selenium 4 already supports it. Eventually, you'll get the best of both worlds through a unified spec.

Until then:

  • WebDriver for cross-browser standardization
  • CDP for Chrome power-user features

Resources

Running Microsoft's Phi-3 on CPU with Rust & Candle

2025-12-03 18:56:45

Python is currently the dominant tool for training machine learning (AI) models. With tools like PyTorch and Hugging Face Transformers in its ecosystem, it has never been easier to create a proof of concept for an AI model. We are big fans of Python because of how easy and flexible it is.

However, let's take a look at the deployment aspect of using the model.

When it comes to inference (running a trained model) in a production environment, or on devices that aren't full-fledged computers, Python starts to show its disadvantages.

If you've tried deploying a PyTorch application before, you already know the pain points: a multi-gigabyte Docker image just to run a simple Python script, slow cold starts while a large interpreter spins up, and memory requirements that rule out most consumer-level CPUs and edge devices.

Is it possible to benefit from all of the advantages of modern large language models (LLMs) and deploy them with significantly less overhead?

This is where Rust and Candle come into play.

Candle is a minimalist machine learning (ML) framework created by Hugging Face that lets you run state-of-the-art AI models without depending on Python. When we combine Rust's memory safety and speed with Microsoft's Phi-3 (a capable small language model), we reach an entirely different level of performance.

In this guide, I will show you how to:

Ditch the heavy PyTorch dependency.

Load a quantized Phi-3 model directly in Rust.

Build a standalone, lightweight CLI tool that runs blazing-fast inference on a standard CPU.

No GPU required. No 5GB Docker images. Just pure, high-performance Rust.

Let's dive in.

Step 1: Setting Up the Project

First, let's create a new Rust project. Open your terminal and run:
cargo new rust-phi3-cpu
cd rust-phi3-cpu

Next, we need to add the Candle stack to our Cargo.toml. Since we are focusing on CPU inference, we will use the quantized features to keep the model lightweight and fast.

Open Cargo.toml and add the following dependencies:

[package]
name = "rust-phi3"
version = "0.1.0"
edition = "2021"

[dependencies]
anyhow = "1.0"
tokenizers = "0.19.1"
clap = { version = "4.4", features = ["derive"] }

candle-core = { git = "https://github.com/huggingface/candle.git", branch = "main" }
candle-transformers = { git = "https://github.com/huggingface/candle.git", branch = "main" }
candle-nn = { git = "https://github.com/huggingface/candle.git", branch = "main" }

We are using candle-transformers which has built-in support for GGUF (quantized) models. This is the secret sauce for running heavy models on a CPU efficiently.
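To see why quantization matters, a back-of-envelope memory estimate helps (the ~4.5 bits per parameter figure for q4_k_m is an approximation that includes the per-block scale overhead; exact numbers vary):

```rust
// Rough model-memory estimate: parameters * bits-per-parameter / 8 bytes.
fn model_size_gb(params: f64, bits_per_param: f64) -> f64 {
    params * bits_per_param / 8.0 / 1e9
}

fn main() {
    let params = 3.8e9; // Phi-3-mini has roughly 3.8B parameters
    let fp16 = model_size_gb(params, 16.0); // unquantized fp16 baseline
    let q4 = model_size_gb(params, 4.5); // q4_k_m: ~4.5 bits/param incl. scales
    println!("fp16: {:.1} GB vs q4_k_m: {:.1} GB", fp16, q4);
    assert!(fp16 / q4 > 3.0); // roughly a 3-4x memory reduction
}
```

That reduction is what brings a modern LLM within reach of an ordinary laptop CPU, and it roughly matches the ~2.3 GB GGUF file we download below.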

Step 2: The Implementation

Now, open src/main.rs. We are going to build a simple CLI that accepts a prompt and generates text.

The logic is straightforward:

Load the Model: Read the .gguf file (Phi-3 Mini).

Tokenize: Convert the user's prompt into numbers (tokens).

Inference Loop: Feed tokens into the model one by one to generate the next word.

Here is the main.rs

use anyhow::{Error as E, Result};
use clap::Parser;
use candle_transformers::models::quantized_phi3 as model; // Using Phi-3 specific module
use candle_core::{Tensor, Device};
use candle_core::quantized::gguf_file;
use tokenizers::Tokenizer;
use std::io::Write;

#[derive(Parser, Debug)]
#[command(author, version, about, long_about = None)]
struct Args {
    /// The prompt to run inference with
    #[arg(short, long, default_value = "Physics is fun. Explain quantum physics to a 5-year-old in simple words:")]
    prompt: String,

    /// Path to the GGUF model file
    #[arg(long, default_value = "Phi-3-mini-4k-instruct-q4.gguf")]
    model_path: String,
}

fn main() -> Result<()> {
    let args = Args::parse();

    println!("Loading model from: {}", args.model_path);

    // 1. Setup Device (CPU)
    let device = Device::Cpu;

    // 2. Load the GGUF Model
    let mut file = std::fs::File::open(&args.model_path)
        .map_err(|_| E::msg(format!("Could not find model file at {}. Did you download it?", args.model_path)))?;

    // Read the GGUF content
    let content = gguf_file::Content::read(&mut file)?;

    // Load model weights (Flash Attention set to false for CPU)
    let mut model = model::ModelWeights::from_gguf(false, content, &mut file, &device)?;

    // 3. Load Tokenizer
    println!("Loading tokenizer...");
    let tokenizer = Tokenizer::from_file("tokenizer.json").map_err(E::msg)?;

    // 4. Encode Prompt
    let tokens = tokenizer.encode(args.prompt, true).map_err(E::msg)?;
    let prompt_tokens = tokens.get_ids();
    let mut all_tokens = prompt_tokens.to_vec();

    // 5. Inference Loop
    println!("Generating response...\n");
    let to_generate = 100; // Generate up to 100 tokens
    let mut logits_processor = candle_transformers::generation::LogitsProcessor::new(299792458, None, None);

    print!("Response: ");
    std::io::stdout().flush()?;

    // Feed the full prompt first so the model's KV cache contains its context,
    // then sample the first generated token from the resulting logits.
    let input = Tensor::new(prompt_tokens, &device)?.unsqueeze(0)?;
    let logits = model.forward(&input, 0)?.squeeze(0)?;
    let mut next_token = logits_processor.sample(&logits)?;
    all_tokens.push(next_token);

    if let Ok(t) = tokenizer.decode(&[next_token], true) {
        print!("{}", t);
        std::io::stdout().flush()?;
    }

    for _ in 0..to_generate {
        // Feed one token at a time; the position argument is the number of
        // tokens the model has already processed (and cached).
        let input = Tensor::new(&[next_token], &device)?.unsqueeze(0)?;
        let logits = model.forward(&input, all_tokens.len() - 1)?.squeeze(0)?;

        next_token = logits_processor.sample(&logits)?;
        all_tokens.push(next_token);

        if let Ok(t) = tokenizer.decode(&[next_token], true) {
            print!("{}", t);
            std::io::stdout().flush()?;
        }
    }

    println!("\n\nDone!");
    Ok(())
}

Step 3: Getting the Model Weights

Before we run this, we need the actual brain of the AI. Since we are optimizing for CPU, we will use the GGUF format (Quantized).

You can download the Phi-3-Mini-4k-Instruct (Quantized) from Hugging Face. Look for the q4_k_m.gguf version (approx. 2.3 GB).

Model: Phi-3-mini-4k-instruct-q4.gguf

Tokenizer: tokenizer.json (from the official repo)

Step 4: Running the Demo

This is the moment of truth.

Before running, make sure to compile in release mode. Rust’s debug builds are optimized for debugging, not speed. For AI inference, the difference is massive (often 10x-100x slower in debug mode).
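If you want to squeeze out a little more, Cargo's release profile can be tuned further. This is optional and not required by the setup above; these are standard Cargo settings:

```toml
[profile.release]
opt-level = 3     # default for release builds, shown for clarity
lto = true        # cross-crate link-time optimization
codegen-units = 1 # slower compile, better optimized binary
```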

Run the following command in your terminal:
cargo run --release -- --model-path "Phi-3-mini-4k-instruct-q4.gguf" --prompt "whatever you want:"

What happens next? You won't see a long delay while a Python interpreter spins up or massive libraries import. Instead, within milliseconds, you’ll see the tokens streaming to your console.

On my local machine (a Surface with an Intel 8100Y), I get a smooth stream of tokens, generated entirely on the CPU. No heavy lifting required.

The model weights file is the same size either way; it's the deployment artifact (code + runtime) that is drastically smaller in Rust.

This makes Rust an ideal candidate for Edge AI, IoT devices, or Serverless functions where startup time and memory footprint are critical costs.

Does this mean Python is dead? Absolutely not.

Python remains the best ecosystem for training, experimenting, and data science. However, when it's time to take that model and ship it to production—especially in resource-constrained environments—Rust is a superpower.

By using frameworks like Candle, we can run modern LLMs like Phi-3 on standard CPUs with incredible efficiency. We get the safety and speed of Rust without sacrificing the capabilities of modern AI.

Enterprise Interop Made Easy: WASM Compiled Libraries for Java Developers

2025-12-03 18:56:27

WebAssembly (WASM) is rapidly evolving into the "universal virtual machine". Originally designed to let browsers run non-JavaScript code, WASM has grown beyond its initial scope. Today, C/C++, Rust, Go, Zig, and even higher-level languages like Python and C# can compile to WASM, making it a powerful cross-platform runtime.

WASM vs JVM: Two Stack Machines

Both the WASM Virtual Machine and the Java Virtual Machine (JVM) are stack‑based machines. This architectural similarity means that compiling from WASM bytecode to JVM bytecode is not only possible but practical. Projects like Chicory demonstrate this by translating WASM modules into native JVM classes.
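To make the similarity concrete, here is integer addition of two locals in both instruction sets (illustrative; variable names and slot numbers are arbitrary). In each case the operands are pushed onto an operand stack, and the add instruction pops both and pushes the sum:

```
;; WASM (text format)          // JVM bytecode
local.get $a                   iload_1
local.get $b                   iload_2
i32.add                        iadd
```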

This approach solves a long‑standing problem:

How do we reuse existing C/C++ libraries in modern toolchains without rewriting everything?

Traditionally, developers faced two choices:

  • The Java Approach: Rewrite everything in Java for portability and security.
  • The Python Approach: Wrap native libraries via shared libraries/DLLs for quick integration.

Each has trade‑offs:

  • Java: Interoperability and sandboxed execution (no risky system calls).
  • Python: Fast integration with wrappers.
  • Python downside: Requires platform‑specific DLLs/shared libraries.
  • Java downside: Costly rewrites of large codebases.

WASM as the Bridge

With WASM, we no longer need to choose between rewriting or wrapping. Instead:

  1. Compile existing C/C++ code to WASM.
  2. Translate WASM into JVM classes using Chicory.
  3. Expose the result as a native Java library.

This yields a pure JVM artifact — no external DLLs, no native dependencies — while still leveraging proven C/C++ libraries.

Proof of Concept: DOCX to PDF in Java

As a demonstration, we compiled a non‑trivial C/C++ library capable of rendering Microsoft Word .docx files to PDF. Using Chicory, we produced a native Java library published via Maven: WordSDK Java HelloWorld

Here’s how it integrates with e.g. Docx4J to produce a PDF:

WordSDK.Worker api = WordSDK.createWorker(options);
// Create an import stream for feeding into WordSDK
OutputStream importStream = api.createImportStream();
// Feed the Docx4J document into WordSDK
Docx4J.save(wordMLPackage, importStream, Docx4J.FLAG_NONE);
// Generate an in-memory PDF for further processing...
final byte[] pdf = api.exportPDF();

The result: a native JVM library that consumes WASM‑compiled C/C++ code, seamlessly integrated into Java enterprise workflows. Go to WordSDK Java HelloWorld to give it a try!

Performance

One of the most common questions when introducing a new runtime layer is: “But how fast is it?”

The good news: performance is excellent. Running WASM modules inside the JVM via Chicory is comparable to executing WASM in environments like Node.js. As with any virtual machine, there is a small overhead compared to native execution, but this is the same trade‑off developers already accept when running code in the JVM or other managed runtimes.

Why This Matters

  • Enterprise relevance: Many systems still rely heavily on Java.
  • Security: No native DLLs or system calls — everything runs inside the JVM sandbox.
  • Portability: WASM makes libraries cross‑platform by design.
  • Future‑proofing: WASM is becoming the universal compilation target, much like the JVM did for Java.

Special Thanks

Huge credit to the Chicory project. Its ability to compile non‑trivial WASM modules into JVM classes is a game‑changer for developers bridging ecosystems.

We are curious to hear how this approach could fit into your enterprise workflows — and if you’ve experimented with this approach yourself, we’d love to learn from your experiences too.

Understanding Dependency Injection Lifetimes: Singleton, Scoped, and Transient

2025-12-03 18:55:07

Hello there!👋🧔‍♂️ Today, we're exploring the three main service lifetimes: singleton, scoped, and transient.

Understanding dependency injection (DI) lifecycle management is crucial for building reliable applications. Get it wrong, and you might end up with memory leaks or shared state bugs. Get it right, and your code will be cleaner and easier to maintain!

Overview

When you register a service in a dependency injection container, you need to specify its lifetime: how long the container should keep an instance alive. The three main lifetimes are:

  1. Singleton - One instance for the entire application lifetime
  2. Scoped - One instance per scope (typically per HTTP request in web apps)
  3. Transient - A new instance every time it's requested

1. Singleton

What is Singleton?

Singleton is like having one shared tool that everyone uses. There's only one instance, and it lives for the entire lifetime of your application.

Key Characteristics:

  • One instance for the entire application
  • Shared across all requests
  • Created once when first requested
  • Disposed when the application shuts down

When to Use Singleton

Use singleton for:

  • Stateless services - Services that don't hold instance-specific data
  • Expensive to create - Services costly to initialize (caches, HTTP clients)
  • Shared resources - Configuration, logging, caching

Example

// Register as singleton
services.AddSingleton<ICacheService, CacheService>();
services.AddSingleton<ILogger, Logger>();

// Example: Cache service
public class CacheService : ICacheService
{
    private readonly Dictionary<string, object> _cache = new();
    private readonly object _lock = new object();

    public void Set<T>(string key, T value)
    {
        lock (_lock) { _cache[key] = value; }
    }

    public T Get<T>(string key)
    {
        lock (_lock)
        {
            return _cache.TryGetValue(key, out var value) ? (T)value : default(T);
        }
    }
}

Common Pitfalls

⚠️ Watch out for:

  1. Shared State Issues
   // BAD: Singleton with mutable state
   public class BadService
   {
       public string CurrentUserId { get; set; } // Shared across all requests!
   }
  2. Thread Safety - Ensure thread safety if shared state is needed
  3. Disposal - Implement IDisposable if holding resources

2. Scoped

What is Scoped?

Scoped is like having a personal workspace for each task. In web applications, each HTTP request gets its own scope, and all scoped services share the same instance within that scope.

Key Characteristics:

  • One instance per scope (typically per HTTP request)
  • Shared within the same scope, different across scopes
  • Created when the scope begins
  • Disposed when the scope ends

When to Use Scoped

Use scoped for:

  • Database contexts - Entity Framework DbContext
  • Request-specific services - Services that need state during a request
  • Unit of Work patterns - Services coordinating multiple operations

Example

// Register as scoped
services.AddScoped<IDbContext, ApplicationDbContext>();
services.AddScoped<IOrderService, OrderService>();

// Example: Database context
public class ApplicationDbContext : DbContext
{
    public DbSet<Order> Orders { get; set; }
}

public class OrderService
{
    private readonly ApplicationDbContext _context;

    public OrderService(ApplicationDbContext context) => 
        _context = context; // Same instance throughout the request

    public async Task<Order> CreateOrder(Order order)
    {
        _context.Orders.Add(order);
        await _context.SaveChangesAsync();
        return order;
    }
}

Common Pitfalls

⚠️ Watch out for:

  1. Using Scoped Services in Singleton
   // BAD: Singleton depending on scoped service
   public class BadSingleton
   {
       private readonly IScopedService _scoped; // Error!
   }

   // GOOD: Use IServiceProvider
   public class GoodSingleton
   {
       private readonly IServiceProvider _serviceProvider;

       public GoodSingleton(IServiceProvider serviceProvider) => 
           _serviceProvider = serviceProvider;

       public void DoWork()
       {
           using (var scope = _serviceProvider.CreateScope())
           {
               var scoped = scope.ServiceProvider.GetRequiredService<IScopedService>();
               scoped.DoWork();
           }
       }
   }
  2. Capturing Scoped Services - Don't capture scoped services in background tasks without proper scope management

3. Transient

What is Transient?

Transient is like getting a fresh tool every time you need one. Each time you request a transient service, you get a brand new instance. Important: transient instances are NOT kept alive for the app lifetime; they're created on demand and disposed (garbage collected) when they go out of scope.

Key Characteristics:

  • New instance every time it's requested
  • No sharing between consumers
  • Created on demand
  • Disposed when out of scope (garbage collected, NOT kept for app lifetime)
  • Shortest lifetime of all three options

When to Use Transient

Use transient for:

  • Lightweight services - Services that are cheap to create
  • Stateless operations - Services that don't maintain state
  • Services that shouldn't be shared - When sharing would cause issues

Example

// Register as transient
services.AddTransient<IEmailService, EmailService>();
services.AddTransient<IValidator<Order>, OrderValidator>();

// Example: Validator service
public class OrderValidator : IValidator<Order>
{
    public ValidationResult Validate(Order order)
    {
        var result = new ValidationResult();

        if (order.Items == null || order.Items.Count == 0)
            result.AddError("Order must have at least one item");

        return result;
    }
}

Common Pitfalls

⚠️ Watch out for:

  1. Performance Issues - Don't use transient for expensive-to-create services
  2. Memory Leaks - Don't hold static references in transient services
  3. Unnecessary Creation - Don't use transient when singleton would work

Comparison Table

| Aspect | Singleton | Scoped | Transient |
|--------|-----------|--------|-----------|
| Instance count | One for entire app | One per scope | New every time |
| Lifetime | Application lifetime | Scope lifetime (per request) | Shortest (disposed when out of scope) |
| Memory usage | Low (shared) | Medium | High (many instances created) |
| Thread safety | Must be thread-safe | Usually not needed | Usually not needed |
| Disposal | When app shuts down | When scope ends | When garbage collected |
| Use case | Stateless, expensive | Request-specific | Lightweight, stateless |

Common Scenarios

Web Application

public void ConfigureServices(IServiceCollection services)
{
    // Singleton: Shared across all requests
    services.AddSingleton<ICacheService, CacheService>();

    // Scoped: One per HTTP request
    services.AddScoped<IDbContext, ApplicationDbContext>();

    // Transient: New instance each time
    services.AddTransient<IEmailService, EmailService>();
}

Background Service

public class BackgroundWorker : BackgroundService
{
    private readonly IServiceProvider _serviceProvider;

    public BackgroundWorker(IServiceProvider serviceProvider) => 
        _serviceProvider = serviceProvider;

    protected override async Task ExecuteAsync(CancellationToken stoppingToken)
    {
        while (!stoppingToken.IsCancellationRequested)
        {
            // Create a scope for each iteration
            using (var scope = _serviceProvider.CreateScope())
            {
                var scopedService = scope.ServiceProvider.GetRequiredService<IScopedService>();
                await scopedService.DoWorkAsync();
            }

            await Task.Delay(1000, stoppingToken);
        }
    }
}

Dependency Rules

A service can only depend on services with equal or longer lifetimes.

  • ✅ Singleton can depend on: Singleton
  • ✅ Scoped can depend on: Singleton, Scoped
  • ✅ Transient can depend on: Singleton, Scoped, Transient

Common Mistake:

// BAD: Singleton depending on scoped service
public class SingletonService
{
    private readonly IScopedService _scoped; // Error!
}

// GOOD: Use IServiceProvider to create scope when needed
public class SingletonService
{
    private readonly IServiceProvider _serviceProvider;

    public SingletonService(IServiceProvider serviceProvider) => 
        _serviceProvider = serviceProvider;

    public void DoWork()
    {
        using (var scope = _serviceProvider.CreateScope())
        {
            var scoped = scope.ServiceProvider.GetRequiredService<IScopedService>();
            scoped.DoWork();
        }
    }
}

Best Practices

1. Choose the Right Lifetime

  • Singleton: Stateless, expensive, shared resources
  • Scoped: Request-specific, database contexts
  • Transient: Lightweight, stateless, fresh instance needed

2. Follow Dependency Rules

  • Never have shorter lifetimes depend on longer ones
  • Use IServiceProvider when you need to break the rules
  • Understand the implications of each choice

3. Handle Disposal Properly

// Implement IDisposable for services that hold resources
public class ResourceService : IDisposable
{
    private bool _disposed = false;

    public void Dispose()
    {
        if (_disposed) return;

        // Clean up resources
        _disposed = true;
    }
}

Conclusion

Understanding dependency injection lifetimes is crucial for building robust applications:

Singleton - One instance for the entire application:

  • Use for stateless, expensive-to-create services
  • Ensure thread safety
  • Great for caching, logging, configuration

Scoped - One instance per scope (request):

  • Use for request-specific services
  • Perfect for database contexts
  • Automatically managed in ASP.NET Core

Transient - New instance every time:

  • Use for lightweight, stateless services
  • When you need a fresh instance each time
  • NOT kept alive for app lifetime - instances are created and disposed quickly
  • Be mindful of performance implications (many instances created)

The Golden Rules:

  • Choose the shortest lifetime that meets your needs
  • Follow dependency lifetime rules (shorter can't depend on longer)
  • Implement IDisposable for resource cleanup
  • Test and monitor your choices

Remember, there's no one-size-fits-all answer. The right choice depends on your specific use case and performance requirements. Start with the most restrictive lifetime that works, and adjust as needed.

You've got this! 💪 Happy coding!

Additional Resources