
GoBolt Bets Its Supply Chain Can Handle 2026's Next Major Shock

2026-04-28 16:15:09

Over the last few years, geopolitical tensions, rerouted shipping lanes, and severe weather events have exposed just how fragile global logistics networks really are. Rerouting ships around Africa adds about two weeks to transit time, according to a recent UAE Ministry of Transport briefing. And the disruptions aren't slowing down.

So, as the next shock looms, companies are searching for answers they don't yet have. This piece will examine how building a sustainable supply chain offers a true operational defense and explain why GoBolt is positioning itself at the center of that shift.

The "Just-In-Time" Illusion Is Over

For decades, global distribution leaned hard into cost efficiency at the expense of resilience. Razor-thin margins, geographically concentrated sourcing, and quarterly profit targets drove every decision. It worked, until it didn't.

Extreme rainfall and severe storms now force major ports to halt cargo handling entirely. A 2023 Drewry logistics report found that weather-related port delays added nearly four days to average transit times. Relying on a single chokepoint or a fossil-fuel-dependent network isn't just risky anymore; it's a guaranteed liability. Supply chain professionals are waking up to the need for advanced digital tools and localized distribution to bypass bottlenecks.

Why Going Green Actually Builds Resilience

There's a stubborn misconception that greening a supply chain is just a PR exercise. But the mechanisms required to decarbonize an operation are the same ones that make it structurally stronger.

A 2024 Gartner survey found that 67 percent of logistics leaders see new ESG regulations as the most influential drivers of change. Transitioning to a more ecological model forces companies to map their entire logistics network with precision. That kind of visibility lets managers spot weak links and reroute shipments dynamically during crises.

And there's a direct financial angle too. Reducing dependency on fossil fuels lowers exposure to global energy shocks. Localizing fulfillment and minimizing waste protects the bottom line when unpredictable price spikes hit.

Hedging Against Fuel Volatility with EVs

Legacy logistics still runs on diesel. That means daily operations rise and fall with international energy markets. When geopolitical conflicts threaten oil supplies, transportation costs can skyrocket overnight.

Transitioning to electric transportation networks acts as a financial hedge against that volatility. It's not just theory, either. Insights from LRQA's EiQ platform suggest companies with robust sustainability programs achieve up to 10 percent higher market valuations. Investing in clean tech untethers a business entirely from the chaotic swings of the petroleum market.

GoBolt: Building the Modern Logistics Stack

GoBolt has positioned itself as one of the companies rethinking distribution from the ground up. The company deploys electric vehicle fleets at scale, completing over 40 percent of its last-mile shipments with EVs. For merchant partners, that translates to meaningful insulation from fuel price spikes that competitors still absorb.

GoBolt complements its electric vehicle fleets with advanced emissions tracking and proprietary route optimization algorithms. According to GoBolt leadership, this unified technology stack enables businesses to meet environmental targets while minimizing exposure to supply disruptions. By integrating real-time data and efficient routing, GoBolt reduces Scope 3 transportation emissions and preserves delivery speed.

Localized Networks and Digital Traceability

Physical assets solve only half the problem. True stability requires serious digital visibility.

GoBolt unifies its technical approach by combining strategically located fulfillment centers, localized last-mile networks, real-time tracking, and customer-facing delivery visibility. A 2025 benchmark from Supply Chain Dive found that this localized fulfillment model can reduce last-mile transit times by up to 35 percent, while also providing transparent order status to both merchants and shoppers.

When severe weather hits, that rich data environment allows rapid resource reallocation. It's logistics treated as a transparent science, turning GoBolt into something closer to a risk management partner than a traditional carrier.

Here's a closer look at the key capabilities in play:

  • Real-time traceability: Immediate shipment visibility prevents blind spots during weather delays.
  • Route optimization: Algorithmic planning cuts unnecessary mileage and limits fuel market exposure.
  • Localized distribution: Multi-node fulfillment bypasses congested centralized hubs to keep things moving.

| Logistics Model | Vulnerability Level | Primary Defense Mechanism |
|----|----|----|
| Traditional just-in-time | High | Reactive cost-cutting |
| GoBolt sustainable stack | Low | Proactive route optimization + EV fleets |

Taken together, these changes signal the end of hyper-globalized, hyper-fragile logistics. Extreme weather, energy turmoil, and trade shifts guarantee further disruption. A 2025 World Economic Forum analysis found that sustainable supply chains suffer 20 percent fewer disruption days annually.

That's not a marginal improvement; it's a competitive edge. Companies partnering with providers like GoBolt gain access to electric fleets and localized digital traceability that didn't exist at this scale a few years ago. The greenest distribution network, it turns out, may also be the most stable. And in 2026's unpredictable landscape, stability is the whole game.

 

:::tip This story was distributed as a release by Jon Stojan under HackerNoon’s Business Blogging Program.

:::


Jolly Shah and the Evolution of Sustainable Firmware Design

2026-04-28 16:00:47

Jolly Shah, an embedded software leader at FAANG, is making notable contributions to the field of scalable, power-efficient firmware and sustainable computing. With an engineering career spanning over 18 years, she has played key roles at leading semiconductor companies and other industry-defining organizations, specializing in the architecture and development of low-power firmware for embedded devices, multicore SoCs, and data center storage controllers.

Jolly’s expertise centers on building resilient firmware systems that manage performance, energy consumption, and reliability behind the scenes—enabling billions of digital transactions to occur seamlessly and efficiently across the globe. In today’s rapidly expanding digital landscape, the hidden infrastructure powering cloud services and intelligent devices must operate with unwavering reliability and minimal energy waste. 

As demands on global computing infrastructure intensify, the intersection of sustainability and firmware design has moved to the forefront of engineering. Jolly’s approach positions sustainability as an architectural imperative, demonstrating how firmware engineers are uniquely poised to address environmental impact while advancing efficiency and performance standards at scale.

The origins of sustainable firmware thinking

Jolly first recognized the critical link between firmware engineering and sustainability during her tenure at Audience Inc., where she developed always-on voice DSPs that required extreme resource efficiency. She notes, "The connection first became clear to me during my time at Audience Inc. Working on always-on voice DSPs, I realized that firmware wasn't just about functionality; it was about resource stewardship."

Her mindful approach, which she calls a "milliwatt mindset," deepened as she moved into new roles, expanding her perspective from battery-operated devices to datacenter-scale storage systems. "I had to write code that allowed devices to listen continuously without draining the battery, which meant every clock cycle and power state transition mattered. I later applied this 'milliwatt mindset' to 'megawatt problems', ensuring that massive datacenter storage systems run as efficiently as possible."

The awareness reflects a broader trend in technology, as platform management units and power management controllers in multicore SoCs are increasingly tasked with advanced, fine-grained energy management—an evolution described in technical analyses of platform management firmware for modern semiconductors.

This early encounter with the necessity of balancing function and efficiency laid the groundwork for a career defined by intelligent partitioning of resources and relentless optimization of underlying system software.

Defining success where efficiency is invisible

Measuring achievement in firmware engineering is a nuanced task, especially when the best results are often unseen by end users. Jolly explains, "For me, success is measured by the stability and efficiency of the system. In my work on low-power DSPs, success was quantitative: implementing dynamic clock scaling to shave off milliwatts so a device could listen longer."

She also frames success in qualitative terms, emphasizing prevention: "Success is often qualitative and preventative: it's the crash that didn't happen because of a fix I implemented, or the security vulnerability that was mitigated before it could be exploited." 

The outcome is a seamless user experience, where devices operate quickly, securely, and with minimal energy. This philosophy aligns with emerging methodologies for energy and emissions accounting in large-scale data centers, highlighted by Google's location-based Scope 2 Carbon Footprint reporting and the deployment of carbon-aware scheduling to lower emissions.

Jolly’s mindset points to a subtle but essential truth: The greatest technological advances in sustainability may manifest as the invisible, preventative work of engineering teams who keep large systems optimized and resilient without drawing attention.

Principles of sustainable system design

Guiding frameworks are crucial in balancing speed, resilience, and energy use in firmware design for global infrastructure. Jolly asserts, "My approach to sustainable design is guided by the principle that power consumption must always be proportional to the workload. I don't believe in static power profiles; instead, I implement dynamic clock scaling and centralized platform management to ensure every subsystem—whether it's a sensor hub or a massive storage controller—only draws exactly what it needs, when it needs it."

This principle echoes advances in modular, configurable PMU firmware that can be tailored to specific application requirements, supporting precise power management through features such as memory retention modes and clock gating in complex SoCs. By applying these philosophies consistently from embedded devices to infrastructure, system energy use is shaped responsibly and adaptively in real time.
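To make the idea concrete, here is a minimal sketch of a workload-proportional clock governor, written in Python purely for readability (production PMU firmware would be C against vendor-specific registers); the frequency table, headroom factor, and idle threshold are hypothetical.

```python
# Illustrative sketch only: "power follows the workload" as a tiny governor loop.
# Real platform-management firmware would program hardware clock/power registers.

FREQ_TABLE_MHZ = [100, 400, 800, 1200]   # assumed available clock steps

def pick_frequency(utilization: float) -> int:
    """Map recent utilization (0.0-1.0) to the lowest clock step that keeps headroom."""
    target = utilization * FREQ_TABLE_MHZ[-1] * 1.2   # 20% headroom
    for freq in FREQ_TABLE_MHZ:
        if freq >= target:
            return freq
    return FREQ_TABLE_MHZ[-1]

def governor_tick(utilization: float, set_clock) -> int:
    """One control-loop iteration: scale the clock, or drop to sleep when idle."""
    if utilization < 0.01:
        set_clock(0)               # enter a retention/sleep state instead of idling at speed
        return 0
    freq = pick_frequency(utilization)
    set_clock(freq)
    return freq
```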

Jolly’s approach affirms a growing consensus in the industry—sustainability must be embedded as a core criterion in the system lifecycle, from the architecture phase through operational maintenance.

Orchestrating power in embedded and SoC systems

Jolly’s experience stretches from constrained embedded contexts to multicore SoCs, offering a comparative vantage on energy optimization strategies. "In embedded devices, optimization is about reactivity—implementing dynamic clock scaling and aggressive sleep modes to support always-on features with minimal drain. In multicore SoCs, it shifts to orchestration—using a central Platform Management Unit to balance power across heterogeneous cores so they don't compete for resources."

This orchestration increasingly relies on modular PMU firmware architectures, which allow custom power management modules to be added, enabled, or disabled as system demands evolve. "I see the biggest opportunity in converging reliability with efficiency. By designing firmware that supports self-healing capabilities, we can eliminate the energy-expensive downtime associated with system failures. Sustainable engineering isn't just about running at lower power; it's about keeping the system stable enough to never waste energy on recovery."

As highlighted in recent technical research, the integration of fine-grained multi-core power management controllers and memory subsystems facilitates major reductions in both active and idle power, underscoring the transformative potential of software-driven resource management across architectures.

Ensuring reliability for AI-scale workloads

As artificial intelligence workloads become a dominant force on digital infrastructure, the demands on reliability and efficiency intensify. Jolly highlights the necessity of autonomous system resilience. "I design for stability by building autonomous resilience into the firmware. My experience with heterogeneous SoCs taught me that you cannot rely on a single core to do everything; you need a Platform Management Unit to orchestrate resources and isolate faults. If one subsystem falters under load, it recovers locally without crashing the host."

Her strategy includes the pairing of dynamic clock scaling with autonomous fault isolation, promoting high throughput and continuous uptime even under peak conditions: "Combined with dynamic clock scaling, this ensures the system is resilient enough to handle peak loads and efficient enough to sustain them." This perspective is echoed in both academic reviews of SoC system design, which identify cross-layer resilience and energy-aware control as essential vectors for scalability and sustainable operation, and in technical guidance on using platform-level orchestration for high-reliability multicore systems.

The result is a firmware-driven approach to infrastructure: responsive, adaptive, and robust against an era of ever-expanding compute and data requirements.

Sustainability in architectural decisions

Jolly’s architectural strategy for sustainability rests on proactive partitioning and adaptive design. She explains, "I incorporate sustainability by focusing on intelligent partitioning and adaptability. Early in the architecture phase, I ask: 'What is the lowest-power resource that can reliably handle this task?' My background in heterogeneous computing taught me to push background tasks to dedicated low-power controllers, keeping the main engines asleep. Finally, I design for longevity; I ensure the system remains stable and useful for years, minimizing the need for replacement."

Integrating power and memory controllers that support fast transitions between active, idle, and deep sleep states further optimizes lifetime efficiency—techniques validated in comparative studies of power management for DSPs and emerging SoC platforms. Such strategies not only reduce active energy consumption but also extend the operational lifespan of infrastructure—directly impacting sustainability objectives by lowering the environmental cost of electronics throughout their lifecycle.

These practices align with evolving industry standards, where federated carbon intelligence and advanced measurement techniques are enabling next-generation energy-aware design for large computing fleets, as explored in research on real-time fleet optimization and cloud-scale emission control.

Guiding the next generation of responsible engineers

For Jolly, responsible technology starts with adopting a holistic view of both power and reliability. She advises, "My advice is to stop viewing power optimization as just a battery-saving feature and start viewing it as architectural hygiene. I also advise engineers to equate reliability with sustainability. The most responsible system is one that is robust enough to last and smart enough to consume only exactly what it needs."

This perspective is increasingly recognized as foundational—not only in firmware development but throughout hardware engineering, as the industry invests in robust frameworks for GreenOps, carbon-neutral cloud operations, and global-scale distributed computing. Training and mentoring the next generation to prioritize architectural integrity ensures continuity and acceleration of advanced sustainability practices.

In addition to technical ingenuity, Jolly’s emphasis on reliability over replacement fosters a mindset shift that is needed for truly sustainable infrastructure, supporting the industry’s movement toward a digital future that balances innovation, scale, and responsibility.

The evolution of sustainable firmware design is unfolding in the hands of engineers who embrace complexity, adapt architectures for low energy, and see reliability as a key measure of environmental impact. Jolly’s work demonstrates that the most effective contributions are often invisible—achieved in code, orchestrations, and design choices that quietly shape the future of global technology with every efficient transaction.


:::tip This story was distributed as a release by Jon Stojan under HackerNoon’s Business Blogging Program.

:::


Tether Partners with Fasset to Launch the World's First Gold-Backed Visa Card and ATMs Globally

2026-04-28 14:19:01

Tether, the largest company in the digital asset industry, today announced the launch of the world's first gold-backed Visa neobanking card in collaboration with Fasset, a stablecoin neobanking and investment platform that allows users to receive money, invest, earn, and make payments from anywhere in the world — marking a strategic step towards mainstream use cases for tokenized gold on a global scale.

The card will operate on the Visa network, enabling users to spend USD at all merchant stores worldwide that accept Visa cards, while earning up to 6% cashback in XAU₮ on eligible transactions. Supported by a tiered rewards system that scales with spending, this effectively creates a reward layer tied to Tether's gold-backed assets. Additionally, the card features an automatic round-up function that rounds every transaction to the nearest dollar and auto-invests the spare change in XAU₮, enabling continuous, passive gold accumulation through everyday spending.

Fasset has built a platform that brings together multi-currency accounts, fast transfers, instant settlements, a global debit card, and access to interest-free investments across crypto, stocks, funds, commodities, and more. With a presence across Asia and Africa, they are expanding in some of the world's most dynamic markets, connecting people to tools that make saving, spending, and investing seamless. They also operate as one of the largest digital asset off-ramp providers in their region, ensuring the USD₮-to-fiat conversion layer is highly optimized for speed and reliability.

This initiative will introduce a new global payment model that combines the accessibility of digital assets with the long-standing stability of gold. It also reflects the next phase of gold-backed asset usage by embedding it into familiar financial experiences. The card will be integrated directly with Fasset's wallet infrastructure, with XAU₮ cashback flowing into users' wallets in real time.

Globally, digital assets are increasingly used for practical, real-world financial activities, including payments, remittances, savings, and value preservation beyond trading. Stablecoins now exceed $300 billion in circulation; USD₮ dominates with a market cap of over $186 billion, and annual transaction volumes surpass $33 trillion. Meanwhile, demand is growing for financial tools that combine usability with stability, particularly in emerging markets where currency volatility remains a challenge. Tether continues to expand its role in building infrastructure to meet these needs. As part of the launch, Tether is committing up to $1 million in Tether Gold (XAU₮) to power the card's rewards ecosystem, accelerating the distribution and real-world use of tokenized gold at scale.

"Historically, gold has been a store of value and not a medium of exchange. This changes that narrative," said Paolo Ardoino, CEO of Tether.

"By collaborating with foundational systems that make digital assets practical and accessible globally, we are extending the utility of our ecosystem: connecting stablecoins and tokenized gold to real-world payment systems, giving users the option to hold gold and spend it when they choose without friction or borders."

Mohammad Raafi Hossain, CEO and Co-Founder of Fasset, said:

"For over a thousand years, gold has been the most trusted store of wealth across our markets. We're bringing it into the digital age. With $32 billion in annualized volume,95% of which is held in real-world assets, and the world's first gold-backed neobanking card, Fasset is building the infrastructure to make Tether Gold the most widely held digital gold token in emerging markets. This isn't just a card, it enables the adoption of digital gold at scale through Fasset's extensive distribution network."

Fasset's infrastructure and distribution in high-growth markets, alongside Tether's global liquidity and asset issuance capabilities, enable this collaboration to scale rapidly across regions where demand for stable, asset-backed financial tools continues to rise. Together, the companies are creating a global infrastructure layer for the future of asset-backed banking, connecting thousands of years of gold heritage with modern blockchain technology.

About Tether and USD₮

Tether is a pioneer in the field of stablecoin technology, driven by an aim to revolutionize the global financial landscape and provide accessible, secure, and efficient financial, communication, and energy infrastructure. Tether enables greater financial inclusion and communication resilience, fosters economic growth, and empowers individuals and businesses.

As the creator of the largest, most transparent, and liquid stablecoin in the industry, Tether is dedicated to building sustainable and resilient infrastructure for the benefit of underserved communities. By leveraging cutting-edge blockchain and peer-to-peer technology, it is committed to bridging the gap between traditional financial systems and the potential of decentralized finance.

About Tether Gold (XAU₮)

Tether Gold (XAU₮) is a digital asset offered by TG Commodities Limited. One full XAU₮ token represents one troy fine ounce of gold on a London Good Delivery bar. XAU₮ is available as an ERC-20 token on the Ethereum blockchain. The token can be traded or moved easily at any time, anywhere in the world, and can be transferred to any on-chain address from the purchaser's Tether wallet, where it is issued after purchase. The allocated gold is identifiable by a unique serial number, purity, and weight, and is redeemable for physical gold.

About Fasset

Fasset is an American-founded global neobanking and investment platform focused on enhancing financial inclusion in emerging markets. With an annualized volume of $32 billion, Fasset is the world's fastest growing asset-backed stablecoin-powered neobank, serving users across 125 countries. Founded by Mohammad Raafi Hossain and Daniel Ahmed, Fasset has raised $26.7 million in funding and holds regulatory approvals across the UAE, Indonesia, Malaysia, the EU, Turkey, Pakistan, and others.

Important Note:

This article is not an offer to sell or the solicitation of an offer to buy Tether Gold (XAU₮). TG Commodities Limited will only sell or redeem XAU₮ pursuant to its gold token terms of sale and service available (as of the date of this press release) at gold.tether.to/legal.

:::info This article is published under HackerNoon Business Blogging program.

:::


Solidity Developers Should Stop Blindly Chasing Tools and Start Mastering ERC Standards

2026-04-28 14:01:49

Most Solidity education focuses on syntax, patterns, and tools. Learn the language, deploy a contract, move on.

That used to be enough.

In 2026, every major ERC standard is its own ecosystem. Some have bundlers, paymasters, and SDKs. Others run compliance modules and identity registries. This isn’t even in the same category as ERC-20. Each standard is its own open system — a full-blown market for smart account tooling and integrations.

Trillions in real-world assets are heading onchain. The developers who capture that opportunity won't be the ones who learned the most tools - they'll be the ones who went deep on the right standards.

That's where the leverage is now.

RWA & Compliance

Institutional blockchain with real volume is already here. Canton Network, a private permissioned chain built by a consortium of banks, processes over $280 billion in daily repo settlements through Broadridge's Distributed Ledger Repo (DLR) platform. Over $3.6 trillion in tokenized real-world assets live on that network today. The catch: it's a closed ecosystem. You don't build on Canton the way you build on Ethereum; you need to be invited.

The open alternative is being written in ERCs. Tokenizing a US Treasury or a real estate fund isn't like deploying an ERC-20. You need investor whitelisting, transfer restrictions, the ability to freeze assets, forced transfers for legal enforcement, and on-chain identity verification. Standard token interfaces have none of that. Two standards exist to fill that gap. One is the full compliance stack, the other is the universal language that lets DeFi talk to it.

ERC-3643 (T-REX)

ERC-3643, known as T-REX (Token for Regulated EXchanges), is the most battle-tested compliance standard on Ethereum. Final status. In production. Originally developed by Tokeny, it's the architecture behind a significant share of institutional tokenized securities issuances today.

At its core are roughly nine contracts: the token itself, an identity registry mapping investor wallets to on-chain identities via ONCHAINID, a modular compliance engine defining the actual transfer rules, a trusted issuers registry controlling which claim issuers are authoritative, and a claim topics registry defining what claims matter: KYC status, accredited investor classification. When a transfer is attempted, the compliance module checks the identity registry. If anything fails — wrong jurisdiction, unverified investor, limit exceeded — the transaction reverts. Compliance logic is pluggable: swap modules for different jurisdictions without touching the token contract.
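As a rough illustration of that flow (not the actual ERC-3643 Solidity contracts; the class and method names below are hypothetical), a compliance-gated transfer looks something like this:

```python
# Sketch of the T-REX-style transfer gate: identity check, then pluggable
# compliance modules, then the balance move. Python pseudocode for readability.

class IdentityRegistry:
    def __init__(self):
        self.verified = {}            # wallet -> {"country": ..., "claims": {...}}

    def is_verified(self, wallet) -> bool:
        return wallet in self.verified

class CountryRestriction:
    """One pluggable rule; swap modules per jurisdiction without touching the token."""
    def __init__(self, allowed):
        self.allowed = set(allowed)

    def check(self, sender, receiver, amount, registry) -> bool:
        return registry.verified[receiver]["country"] in self.allowed

class RegulatedToken:
    def __init__(self, registry, modules):
        self.registry = registry
        self.modules = modules
        self.balances = {}

    def transfer(self, sender, receiver, amount):
        # 1. both parties must hold a verified on-chain identity
        if not (self.registry.is_verified(sender) and self.registry.is_verified(receiver)):
            raise PermissionError("transfer reverted: unverified investor")
        # 2. every compliance module must approve, otherwise the transfer reverts
        for module in self.modules:
            if not module.check(sender, receiver, amount, self.registry):
                raise PermissionError("transfer reverted: compliance rule failed")
        # 3. only then does the balance move
        self.balances[sender] = self.balances.get(sender, 0) - amount
        self.balances[receiver] = self.balances.get(receiver, 0) + amount
```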

This is what makes T-REX genuinely useful for institutions. A single token can support multiple transfer restriction regimes simultaneously. If you're planning to build in the RWA compliance space, this is the standard to go deep on first.

Resources ERC-3643

ERC-7943 (uRWA)

ERC-3643 solves compliance for issuers. ERC-7943 solves a different problem: DeFi protocols have no generic way to interact with compliant RWA tokens.

If a lending protocol wants to accept a tokenized treasury as collateral, how does it check whether a transfer is allowed? How does it detect frozen assets? Every RWA implementation today has its own interface. Protocols either build custom integrations per issuer, or ignore compliant assets entirely. Neither scales.

ERC-7943 — the Universal RWA Interface — proposes a minimal, implementation-agnostic surface that any RWA token can expose regardless of what compliance system it uses underneath: canTransact, canTransfer, getFrozenTokens, forcedTransfer, setFrozenTokens. Five functions. No opinion on identity systems, no mandated role structures, no specific KYC provider.

The result: a DeFi protocol implementing ERC-7943 support can interact with any compliant RWA token — whether it's built on T-REX, a custom compliance system, or anything else. The two standards are complementary rather than competing. A T-REX token can expose the uRWA interface as its public-facing integration surface.
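A minimal sketch of that integration surface, reduced to the two calls a lending protocol might make before accepting collateral; the argument lists are assumptions for illustration, not the exact ERC-7943 signatures:

```python
# Hedged illustration of a uRWA pre-flight check (canTransfer / getFrozenTokens
# analogues). The real interface is Solidity; names and parameters are assumed.

from abc import ABC, abstractmethod

class URWAToken(ABC):
    @abstractmethod
    def can_transfer(self, frm, to, amount) -> bool: ...
    @abstractmethod
    def get_frozen_tokens(self, holder) -> int: ...

def accept_as_collateral(token: URWAToken, depositor, vault, amount) -> bool:
    """Pre-flight check before pulling a compliant RWA token into the protocol."""
    if token.get_frozen_tokens(depositor) > 0:
        return False                      # frozen balances cannot back a loan
    # works the same whether the token is T-REX-based or uses custom compliance
    return token.can_transfer(depositor, vault, amount)
```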

Important note: as of April 2026, ERC-7943 is in Last Call, not Final. Teams are already building against it, but mention that status to any institutional counterpart you're speaking with.

Resources ERC-7943

Vaults & Yield

It used to be that every DeFi protocol that handled yield had its own vault interface. Yearn had one. Aave had another. Compound had its own. Building an aggregator or any product that touched multiple protocols meant writing custom adapters for each one, getting each audited separately, and maintaining all of it when any protocol upgraded. That was the state of DeFi yield infrastructure as recently as 2022.

ERC-4626 fixed that. Over $15 billion in vault TVL across more than 2,700 vaults on Ethereum mainnet alone runs on this standard today. Yearn, Morpho, Euler, Ondo, Centrifuge and many other projects adopted it.

ERC-4626 (Tokenized Vault Standard)

ERC-4626 is a simpler standard than ERC-3643 in terms of surface area. It extends ERC-20 and adds a standardized interface for vaults: how you deposit assets, how you receive shares, how you redeem, how share-to-asset pricing is calculated. That's the whole thing. But the composability that comes from that simplicity is the point.

When every vault speaks the same interface, a protocol integrating ERC-4626 support once can work with every compliant vault automatically. Yield aggregators, collateral managers, lending protocols — they all plug into the same surface. Vault shares are ERC-20 tokens, which means they can be traded, used as collateral, or deposited into other vaults. The ecosystem this created is what drove the TVL numbers above.

The standard also solves something less obvious: consistent share pricing. Before ERC-4626, the exchange rate between a vault share and its underlying asset was calculated differently everywhere, which made it difficult for external protocols to reason about vault value without bespoke logic. convertToShares and convertToAssets standardize that math across every implementation.
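In code, the standardized math is just a pro-rata conversion with consistent rounding; the numbers below are illustrative:

```python
# The share/asset math that convertToShares / convertToAssets standardize,
# shown as integer arithmetic the way a vault computes it (rounding down).

def convert_to_shares(assets, total_assets, total_supply):
    return assets * total_supply // total_assets

def convert_to_assets(shares, total_assets, total_supply):
    return shares * total_assets // total_supply

# A vault holding 1,050 underlying tokens against 1,000 shares:
print(convert_to_shares(100, total_assets=1_050, total_supply=1_000))  # 95 shares
print(convert_to_assets(95, total_assets=1_050, total_supply=1_000))   # 99 assets (rounded down)
```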

One security consideration worth knowing: the inflation attack. A malicious first depositor can manipulate share pricing by donating assets directly to the vault before any real deposits. OpenZeppelin's implementation and most production vaults address this with virtual shares, but it's a pattern every developer building a new vault should understand before deploying.
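Here is a toy numeric illustration of the attack and of how a virtual-share offset blunts it (simplified relative to OpenZeppelin's actual decimals-offset implementation):

```python
# Toy illustration of the first-depositor inflation attack. No fees, integer math only.

def shares_out(assets, total_assets, total_supply, virtual_offset=0):
    # pro-rata mint, rounding down, with an optional virtual-share offset
    return assets * (total_supply + virtual_offset) // (total_assets + virtual_offset)

# Attacker is the first depositor: 1 wei of assets for 1 share,
# then "donates" 10,000 assets straight to the vault to inflate the share price.
total_assets, total_supply = 1 + 10_000, 1

# Naive vault: the victim's 5,000-asset deposit mints zero shares.
print(shares_out(5_000, total_assets, total_supply))                        # 0

# With a virtual-share offset the donation barely moves the price,
# so the victim still receives shares roughly worth their deposit.
print(shares_out(5_000, total_assets, total_supply, virtual_offset=10**6))  # ~4950
```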

Resources (ERC-4626)

ERC-7540 (Asynchronous ERC-4626 Vaults)

ERC-4626 assumes that deposits and redemptions settle in the same transaction. That works for liquid DeFi strategies but breaks immediately when you try to apply it to real-world assets. A tokenized treasury fund can't redeem in one block — it has T+1 settlement. A real estate fund may need compliance checks before accepting a new investor. A private credit vault needs time to deploy capital off-chain.

ERC-7540 extends ERC-4626 with a request-and-claim pattern. Instead of depositing and receiving shares atomically, a user submits a request. The vault processes it asynchronously — running compliance checks, settling off-chain, computing NAV — and then the user claims their shares in a separate transaction when the vault is ready. The existing ERC-4626 interface is preserved for the claim step, meaning protocols that support ERC-4626 can support ERC-7540 with minimal additional work.
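A minimal off-chain sketch of that request-and-claim state machine follows; the function names mirror the pattern, but this is an illustration, not the ERC-7540 Solidity interface:

```python
# Request -> (compliance / off-chain settlement / NAV) -> claimable -> claim.

from enum import Enum, auto

class RequestState(Enum):
    PENDING = auto()      # submitted, awaiting compliance checks or settlement
    CLAIMABLE = auto()    # vault has settled and priced the request
    CLAIMED = auto()

class AsyncVault:
    def __init__(self):
        self.requests = {}            # request_id -> request record
        self.next_id = 0

    def request_deposit(self, owner, assets):
        rid = self.next_id; self.next_id += 1
        self.requests[rid] = {"owner": owner, "assets": assets,
                              "state": RequestState.PENDING, "shares": 0}
        return rid

    def settle(self, rid, share_price):
        # Forward pricing: the exchange rate is fixed here, not at request time.
        req = self.requests[rid]
        req["shares"] = req["assets"] // share_price
        req["state"] = RequestState.CLAIMABLE

    def claim(self, rid, caller):
        req = self.requests[rid]
        assert req["owner"] == caller and req["state"] == RequestState.CLAIMABLE
        req["state"] = RequestState.CLAIMED
        return req["shares"]          # in ERC-7540 this step reuses the ERC-4626 deposit flow
```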

This is the standard that makes ERC-4626 actually usable for institutional RWA products. Lagoon, one of the larger vault infrastructure providers building across 18+ EVM chains, runs their entire stack on ERC-7540 for exactly this reason. The pattern also supports forward pricing, where the exchange rate is determined at settlement time rather than at request time — a requirement for many regulated fund structures.

ERC-7540 reached Final status. If you're building anything vault-adjacent that involves real-world assets, off-chain settlement, or compliance-gated access, start here rather than trying to bend ERC-4626 into something it wasn't designed for.

Resources (ERC-7540)

Account Abstraction

Every user on Ethereum today carries the same burden: a seed phrase, an ETH balance for gas, and a single point of failure. Lose the key, lose everything. Want to pay gas in USDC? Not possible. Need a corporate wallet with multi-sig approval flows and spending limits? Build it yourself from scratch.

Account abstraction changes this. The idea has been on Ethereum's roadmap since the earliest days, but previous proposals all required consensus-layer changes that proved too politically difficult to ship. ERC-4337 took a different approach entirely: no protocol changes, no hard fork dependency. Just a higher-layer architecture built on top of what already exists.

ERC-4337 (Account Abstraction)

ERC-4337 went live on Ethereum mainnet on March 1, 2023. Over 40 million smart accounts have been deployed since, with nearly 20 million in 2024 alone. The standard has processed over 100 million UserOperations, a tenfold increase from 2023. Entire companies, including Biconomy, Alchemy, and Pimlico, have been built around its infrastructure.

The architecture has six core pieces worth understanding.

A UserOperation is the fundamental unit. It's not a transaction in the traditional sense. It's a pseudo-transaction object that describes what a user wants to do, including custom authentication logic, gas payment preferences, and batched calls. Users sign UserOperations, not raw transactions.
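For orientation, a UserOperation looks roughly like the following; the field names follow the commonly used v0.6 EntryPoint layout (v0.7 repacks several gas fields), and the values are placeholders:

```python
# Rough shape of a UserOperation (assumed v0.6-style fields, placeholder values).
user_op = {
    "sender": "0xYourSmartAccount",          # the smart account, not an EOA
    "nonce": 7,
    "initCode": "0x",                        # non-empty only when deploying the account
    "callData": "0x...",                     # the batched calls the account should execute
    "callGasLimit": 150_000,
    "verificationGasLimit": 120_000,
    "preVerificationGas": 50_000,
    "maxFeePerGas": 30_000_000_000,
    "maxPriorityFeePerGas": 1_000_000_000,
    "paymasterAndData": "0x",                # set this to route gas payment to a Paymaster
    "signature": "0x...",                    # whatever validateUserOp expects: ECDSA, passkey, multisig
}
```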

Bundlers are off-chain infrastructure that monitor a dedicated alt-mempool where UserOperations are submitted. A bundler collects multiple UserOperations, packages them into a single on-chain transaction, and submits it to the EntryPoint contract. Bundlers are the only participants in the ERC-4337 ecosystem that still need an EOA.

The EntryPoint is a singleton smart contract deployed at the same address across every EVM network. It is the trust anchor for the whole system. It receives batched UserOperations from bundlers, calls each account's validateUserOp function to verify authentication, and executes the operations if validation passes.

The Smart Account is the user's actual account contract. It implements validateUserOp with whatever logic the developer chooses: standard ECDSA, passkeys, multisig, biometric authentication, time-locked spending, session keys. The account defines its own security model.

Paymasters are optional contracts that cover gas on behalf of users. An application can deploy a Paymaster and fund it with ETH, letting users transact without ever holding native gas. This is what makes genuinely gasless onboarding possible, not just in theory but in production.

Aggregators handle signature aggregation for accounts that use schemes like BLS, where multiple signatures can be collapsed into one for gas efficiency.

One significant development to mention for readers keeping up: EIP-7702, shipped with Ethereum's Pectra upgrade in May 2025, extends account abstraction to existing EOAs without requiring users to migrate to new addresses. EIP-7702 is complementary to ERC-4337, not a replacement. EOAs with EIP-7702 delegation can use the same bundler and paymaster infrastructure. The account abstraction stack now serves both new smart accounts and upgraded legacy EOAs.

Resources (ERC-4337)

ERC-6900 (Modular Smart Accounts)

ERC-4337 standardizes how accounts interact with the EntryPoint. What it does not standardize is what goes inside the account itself. ERC-6900 defines a standard interface for modular smart accounts: how validation modules attach, how execution modules hook in, how permissions are scoped. A plugin built to ERC-6900 spec can be installed into any compliant account, regardless of who built the account.

The standard is currently in Draft, not Final, which is worth knowing. But it already has meaningful adoption. Alchemy's Modular Account V2 is built on ERC-6900. The Ethereum Foundation's Pectra documentation explicitly calls out ERC-6900 as the standard mechanism for dApps to access smart account capabilities. For institutional deployments where audit reusability and module composability matter, ERC-6900 would be a solid choice.

There is also ERC-7579, a more minimalist alternative to ERC-6900 that standardizes the same interfaces with fewer constraints. The two coexist and there is active debate in the community about which approach wins long-term. Both are worth understanding; which you build against depends on whether you need ERC-6900's comprehensive framework or ERC-7579's lighter surface area.

Resources (ERC-6900)

AI Agents

AI agents are no longer just chatbots. They execute transactions and coordinate with other agents. But how does one agent trust another it has never interacted with before?

Off-chain, this problem is solved by centralized platforms. The same centralization problem that Ethereum solved for tokens and DeFi now applies to the emerging agent economy.

Two ERCs address different layers of this problem. ERC-8004 is the identity and reputation layer. ERC-8001 is the coordination primitive. Together they define what it means for agents to work together on-chain in a trustless way.

ERC-8004 (Trustless Agents)

ERC-8004 was proposed in August 2025 by a group that reads like a who's who of the agent stack: Marco De Rossi from MetaMask, Davide Crapis from the Ethereum Foundation, Jordan Ellis from Google, and Erik Reppel from Coinbase. It went live on Ethereum mainnet in January 2026. The architecture is three lightweight registries, designed to be deployed as singletons per chain.

The Identity Registry is built on ERC-721. Each agent gets minted an NFT whose tokenURI points to a registration file, an agent card containing the agent's name, capabilities, supported protocols (MCP, A2A, ENS, DID), and payment address. The on-chain component anchors identity. The off-chain card provides context. Identifiers follow CAIP-10 so agents have globally unique, chain-agnostic addresses that work consistently across networks.

The Reputation Registry standardizes how feedback gets posted and queried on-chain. A client that hired an agent can post structured feedback tied to that agent's identity token. Scoring and aggregation happen both on-chain for composability and off-chain for more sophisticated algorithms. The design explicitly expects a layer of specialized reputation aggregators, auditor networks, and insurance pools to emerge on top.

The Validation Registry handles heavier trust requirements. It provides hooks for requesting and recording independent third-party verification, whether that's cryptographic proofs, TEE attestations from trusted execution environments, or economic staking. High-value agent interactions can require validator sign-off before execution.

One technical thing worth noting for developers: ERC-8004 deliberately excludes payment mechanics. That is handled by x402, an HTTP-layer payment protocol governed by Coinbase and Cloudflare that lets agents pay for services automatically as part of normal request-response flows. ERC-8004 plus x402 is the two-piece stack most agent infrastructure is being built on.

Status flag: ERC-8004 is on mainnet and actively deployed, but formally still carries Draft status in the EIP repository. The spec is stable enough to build on, and the Ethereum Foundation's dAI team has it on their 2026 roadmap, but flag this when explaining it to anyone who asks about formal ratification.

Resources (ERC-8004)

ERC-8001 (Agent Coordination Framework)

ERC-8004 answers "how do agents identify and trust each other"; ERC-8001 answers "how do multiple agents agree to act together without a coordinator in the middle."

The problem it solves is specific. Existing intent standards on Ethereum, ERC-7521, ERC-7683, and others, define single-initiator flows. One party wants something done and broadcasts an intent. That covers most DeFi use cases but breaks when multiple agents need to jointly commit to a coordinated action before any of them executes. A DeFi trading strategy that requires three agents to simultaneously agree before entering a position. A DAO treasury rebalance that needs sign-off from multiple autonomous actors. A multi-agent system that should only fire when all participants have confirmed their legs are ready.

ERC-8001 defines the minimal primitive for that: an initiator posts an AgentIntent signed with EIP-712, the required participants each submit an AcceptanceAttestation, and the intent is executable only once all valid acceptances are present and unexpired. Replay protection comes from EIP-712 domain binding and monotonic nonces. Everything is compatible with both EOA and smart contract wallets via ERC-1271.
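The gating rule reduces to a small amount of bookkeeping; the sketch below omits EIP-712 signing and nonce handling, and the names are illustrative rather than taken from the spec:

```python
# Intent becomes executable only when every required participant has posted
# a valid, unexpired acceptance.

import time

class AgentIntent:
    def __init__(self, initiator, participants, expiry_ts):
        self.initiator = initiator
        self.participants = set(participants)    # agents that must all accept
        self.expiry_ts = expiry_ts
        self.acceptances = {}                    # agent -> acceptance timestamp

    def accept(self, agent):
        if agent not in self.participants:
            raise ValueError("not a required participant")
        self.acceptances[agent] = time.time()

    def executable(self) -> bool:
        not_expired = time.time() < self.expiry_ts
        all_accepted = self.participants.issubset(self.acceptances.keys())
        return not_expired and all_accepted
```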

The standard is deliberately narrow. Threshold policies, bonding, privacy, and cross-chain semantics are all explicitly out of scope, expected to be added as optional modules on top. This is the same design philosophy as ERC-4626: define the minimal interface that everyone can agree on and let the ecosystem build the rest. ERC-8001 reached Final status but has lower adoption than ERC-8004.

Resources (ERC-8001)

The standards in this article are not equally mature, equally complex, or equally close to production money. But all of them are pointing at the same thing: Ethereum is becoming the settlement layer for things that actually matter. The developers who understand this infrastructure deeply, not just how to write Solidity, but how these systems are designed and why, are going to be the ones building it.

I'm personally working on tooling and monitoring for ERC-3643 deployments. I believe the compliance layer of on-chain finance is the highest-leverage place to be right now.


The Hidden Security Risks Behind Your Home Router's WPS Feature

2026-04-28 14:00:52

If you’ve been following my previous articles on WPA networks, you’ll know the major problem remains weak user passwords. Humans simply struggle to create strong, secure ones, and even when they do, typing a 15-character password every time is annoying, especially on a smart TV, where entering text with a remote is far from ideal.



Want to learn more about passwords? Check out my other article:

https://hackernoon.com/youve-learned-to-break-wi-fi-now-learn-to-lock-it-down?embedable=true


The Wi-Fi Alliance had a solution for this as early as 2006.

The Wi-Fi Alliance reported that 60–70% of users did not configure their routers properly, leaving default credentials, failing to choose secure encryption, or even having no password at all.

For some, this stemmed from a lack of awareness; they didn’t know how, while others neglected it for convenience. This widespread issue prompted vendors to develop their own methods for simplifying Wi-Fi setup without placing too much technical burden on users.

However, these approaches led to compatibility problems. As a result, in 2006, the Wi-Fi Alliance introduced a standardised method for securely setting up and connecting to WPA/WPA2 Wi-Fi networks, known as WPS (Wi-Fi Protected Setup).

WPS made it possible to connect new devices without ever typing the long password. It was a highly convenient feature; however, as is often the case, convenience comes at the cost of security, and this was no exception.

In my previous articles, I mentioned completely disabling the WPS feature, and in this article, I will explain why.

Let’s start with the behind-the-scenes of WPS…

WPS provides a structured framework that enables easy, secure setup and management of wireless networks. You no longer have to enter the actual password to connect a new device.

But how?

To make it easier for non-technical users, WPS utilises the “Lock and Key” mental model.

Let’s use a simple example to illustrate this before we dive into the technical details. Imagine yourself as an Airbnb owner; you frequently rent your house, and I am a traveller staying at your place.

I introduce myself, and you successfully identify me after checking my documents. Once you verify my identity, you hand over the key, which I use to unlock the door and access your house.

WPS works the same way; it hands over the WPA credentials once you prove your authenticity.

The Skeleton (The Architecture)

Components

Three logical actors drive this entire process:

  • Enrollee: A device seeking to join the network (e.g., phone, printer, desktop).
  • Registrar: The device authorised to issue and revoke WLAN (Wireless Local Area Network) credentials.
  • AP: The Access Point (router), which provides connectivity and acts as a proxy.

Even though these three components remain logically separate, they often physically coexist. For example, an access point frequently incorporates the registrar, or the registrar may exist as a separate device. In the case of an external registrar, like a PC or phone, it can even coexist with the enrollee; your PC could act as both at the same time.

The most common and simplest setup features a standalone configuration where your AP includes an in-built registrar.

Interfaces

Depending on the data flow between these three components, the architecture utilises three interfaces:

1. Interface A

  • This interface connects the AP and the Enrollee. Its primary function involves enabling the discovery of Wi-Fi Protected Setup WLANs and facilitating communication between the Enrollee and Registrars via WLAN or Ethernet (using UPnP).

:::info The WPS IE (Information Element) management frame provides the discovery information. According to the specification, this information serves merely as a hint and remains unauthenticated; therefore, users should not trust it.

:::

2. Interface E

  • This interface sits between the Enrollee and Registrar. It enables the Registrar to discover the enrollee and issue WLAN credentials. Here, the AP can physically act as a proxy to convey messages; this interface mainly uses WLAN communication or another out-of-band channel.

3. Interface M

  • Interface M links the AP and the Registrar. It allows an external registrar to manage a WPS AP, using the same registration protocol used for issuing credentials.

While this covers the architecture or the skeleton, the core functionality of WPS lies in the Registration Protocol.

Registration Protocol

The Registration Protocol functions as a three-party in-band protocol to assign a WLAN Credential to the Enrollee. It operates between the Enrollee and the Registrar using mutual authentication and may receive support through a proxy (AP).

WPS utilises two main operating modes: in-band and out-of-band. The Registration Protocol can run entirely in-band, entirely out-of-band, or through a combination of both (hybrid).

1. In-Band

In-band refers to communication within the same channel. In this context, communication between devices within the WLAN. This configuration performs a Diffie–Hellman key exchange, authenticating it with a shared secret called a device password.

Users obtain the device password from the Enrollee and enter it into the Registrar manually via keypad, USB flash drive, or NFC in a hybrid setup.

  • PBC (Push Button Configuration): This method offers the simplest but least secure configuration, triggered by pressing a physical or logical button on the enrollee or registrar. Upon activation, the enrollee actively searches for a registrar in PBC mode. Once it identifies a registrar, the protocol begins. Alternatively, pressing the button on the registrar triggers a 120-second scan for enrollees known as the “walk time” (the time it takes to walk up to your router). To avoid session overlap, both devices terminate the session if they detect more than one registrar or enrollee in PBC mode.
  • PIN (Personal Identification Number): This method requires manual entry of the Device Password or PIN. Devices generally fall into three categories:
  1. Headless devices: These lack a display and use a static 8-digit PIN printed on the hardware (like most home routers).

  2. Devices with displays: These generate a dynamic PIN for each session, varying between 4 and 8 digits.

  3. Hybrid mode: This uses NFC or USB flash drives to deliver strong passwords instead of a standard PIN.


2. Out-Of-Band (OOB)

OOB refers to communication via a separate, dedicated channel, such as a physical Ethernet connection (UPnP). The goal involves sending WLAN credentials and configuration across this out-of-band channel to the enrollee. The out-of-band channel offers optional encryption for these settings.

Currently, WPS supports two out-of-band channels: USB flash drives and NFC.

  • USB Flash Drive: This process remains simple: plug the USB into the external registrar or AP; the registrar writes the credentials to the drive, which you then plug into the enrollee to establish the connection.

  • NFC (Near Field Communication): This contactless technology enables short-distance communication up to 10 cm. It provides a highly secure peer-to-peer option. When a user touches the NFC device to the AP, the devices exchange configurations and Diffie-Hellman public keys via the encrypted NFC channel. This encryption and short physical distance render man-in-the-middle attacks infeasible.


WPS provides three options for out-of-band configurations:

1. Unencrypted Settings: This option places the WLAN credentials unencrypted onto the out-of-band media. It relies on the assumption that the user maintains physical control over the media (like the NFC token or USB drive).

  • Advantages: You can reuse the media with new enrollees without running the registrar again, and it supports legacy APs that cannot forward public keys.
  • Disadvantage: This convenience compromises security; if an attacker steals the media, they obtain the credentials immediately.

2. Encrypted Settings: This option employs a key derived from the enrollee’s public key (obtained in-band) and the registrar’s key to encrypt settings for that specific enrollee. This ensures the media only works for one device. Even so, users should still physically guard the media.

3. NFC Peer-to-Peer Mode: This option boasts the strongest security properties. In this mode, the interface performs a 1536-bit Diffie-Hellman exchange and delivers WLAN settings encrypted with 128-bit AES. Because the devices receive this data over the NFC channel, they implicitly authenticate the keys and settings.

You don’t have to choose just one; in hybrid setups, the initial trust occurs via OOB methods (NFC/USB) while the registration protocol happens via in-band WLAN. This proves the flexibility of the WPS architecture, whether in standalone or hybrid configurations.

Now, let’s look into the core of the registration protocol:

The registration protocol follows a lock-step model where everything occurs sequentially. Each step requires success before the process proceeds to the next. This 8-step sequence enables the Enrollee and Registrar to authenticate each other and issue WLAN credentials.

In short, these 8 steps fall into two phases:

  • Discovery: This phase starts when a user manually enters a password (obtained via display or label) into the Registrar. While waiting for the password, the registrar sends an M2D message containing its description to the enrollee. This allows the enrollee to identify and choose the correct Registrar.

  • Mutual Authentication and Issuing Credentials: Protocol messages M3-M7 incrementally demonstrate that both sides know the device password. Once both sides prove this knowledge, they exchange encrypted configuration data. Message protection relies on a key derivation key (KDK), which the system computes from the Diffie-Hellman secret, nonces, and the Enrollee MAC address.


Now, let’s dissect the 8 steps further:

1. **M1 (Enrollee → Registrar):** The enrollee sends its description (including MAC, UUID-E, and device capabilities), its 1536-bit Diffie-Hellman public key (PKE), and a 128-bit random nonce (N1).

2. **M2 (Registrar → Enrollee):** The registrar responds with its own description, public key (PKR), a random nonce (N2), and an Authenticator (an HMAC keyed with the AuthKey).

:::info M2D: If a Registrar does not yet know the Enrollee’s PIN, it sends M2D. This discovery-only variant omits the public key and authenticator to inform the Enrollee of its presence without performing expensive cryptographic operations.

:::

3. **M3 (Enrollee → Registrar):** The enrollee sends E-Hash1 and E-Hash2 as pre-commitments. These prove knowledge of the first and second halves of the device password (PIN) without revealing the digits immediately.

E-Hash functions as an HMAC (Hash-based Message Authentication Code) that locks together three elements:

  • A secret random nonce (E-S1 or E-S2)
  • Half of the PIN (PSK1 or PSK2)
  • The public keys

At this stage, the Registrar can see the hashes but cannot verify them yet, as it lacks the secret nonces required to complete the check.
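Public write-ups of the spec describe the pre-commitments roughly as follows; treat the exact concatenation order as an assumption, and the values as placeholders:

```python
# How E-Hash1 / E-Hash2 are commonly described: an HMAC over the secret nonce,
# the PIN half, and both public keys, keyed with the session AuthKey.

import hmac, hashlib

def e_hash(auth_key: bytes, secret_nonce: bytes, psk_half: bytes,
           pk_enrollee: bytes, pk_registrar: bytes) -> bytes:
    msg = secret_nonce + psk_half + pk_enrollee + pk_registrar
    return hmac.new(auth_key, msg, hashlib.sha256).digest()

# E-Hash1 commits to the first half of the PIN, E-Hash2 to the second half:
# e_hash1 = e_hash(auth_key, e_s1, psk1, pke, pkr)
# e_hash2 = e_hash(auth_key, e_s2, psk2, pke, pkr)
```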

4. **M4 (Registrar → Enrollee):** The Registrar sends its own pre-commitments (R-Hash1 and R-Hash2) to prove knowledge of the PIN. It also includes its first secret nonce (R-S1), allowing the Enrollee to verify that the Registrar knows the first half of the password.

5. **M5 (Enrollee → Registrar):** The Enrollee sends its first encrypted secret nonce (E-S1). The Registrar uses this E-S1 and the first half of the user-entered PIN to re-calculate E-Hash1. If the result matches the hash from M3, the first half of the password is officially verified. If verification fails, the protocol terminates immediately to prevent brute-force attacks.

6. **M6 (Registrar → Enrollee):** The Registrar sends its second encrypted secret nonce (R-S2), allowing the Enrollee to verify the Registrar’s knowledge of the second half of the PIN.

7. **M7 (Enrollee → Registrar):** The Enrollee sends its second encrypted secret nonce (E-S2) as its final proof of identity. The Registrar performs a final match against E-Hash2 from message M3 to confirm the Enrollee knows the second half of the password.

8. **M8 (Registrar → Enrollee):** This message marks the culmination of the protocol: the moment the “key” to the network finally changes hands. Having fully authenticated the Enrollee, the Registrar sends the WLAN Credentials (SSID and Pre-shared Key).

:::info Keys derived from the KDK (Key Derivation Key) protect each of these messages.

:::

The registration protocol utilises EAP (Extensible Authentication Protocol) to transmit these messages. EAP employs the Wi-Fi Simple Configuration (WSC) method to enable the registration protocol; WSC serves as the core technology of WPS.

Now that we have a basic overview of the behind-the-scenes, we can get a closer look at why we disable WPS.

The Vulnerability

This article focuses on the vulnerabilities associated with in-band PIN authentication, while also providing a brief overview of other weaknesses in both in-band and out-of-band methods.

The core security principle behind WPS states:

The security of a system remains only as strong as its weakest component.

In other words, the weakest link in your architecture determines your overall security, a point that becomes much clearer as we walk through the vulnerabilities.

While the WPS architecture offers flexibility, its weakest component remains the PBC method. PBC provides zero entropy by using a “null PIN” (all zeroes) and omits authentication entirely, despite being the most convenient option.

But how does an attacker exploit this?

Two main vulnerabilities exist here:

  1. If you press the PBC button on the registrar and an attacker activates PBC on their enrollee before your own device does, they gain access to your network.

  2. An attacker can set up a rogue AP and jam the signal from your actual registrar using a deauthentication attack. Since your device lacks a way to authenticate the legitimate AP, it connects to the attacker’s rogue registrar instead.

While the NFC peer-to-peer connection serves as the strongest option, it still carries the risk of theft. Out-of-band methods rely on implicit authentication based on possession; if an attacker steals your USB or NFC token, they compromise the network.

This leaves us with the PIN method. Devices with displays that generate dynamic PINs aim to provide the intended security, whereas labelled or fixed PINs remain susceptible to active attacks like brute-forcing. Ironically, the lock-step design intended to resist brute force actually facilitates both online and offline cracking.

Online Cracking

In this scenario, the attacker engages with a registrar to obtain the Diffie–Hellman keys and then tests different combinations. Whenever a half fails, the device responds with a WSC_NACK message. This feedback notifies the attacker of every failed attempt. Another major factor stems from a design flaw in PIN splitting, which was ironically intended to prevent brute-force attacks yet actually simplifies them.

An 8-digit PIN (10⁸) offers 100 million possible combinations. At a speed of one second per attempt, completing all combinations would take approximately 1,157 days. Even finding the password halfway through would still require around 578 days.

Not practical, right?

But PIN splitting makes it highly practical. Rather than attacking all 8 digits at once, an attacker cracks each half independently — 10,000 combinations (10⁴) for the first 4 digits, and only 1,000 (10³) for the last 3 digits.

Wait, shouldn’t that be the last 4 digits?

Since the 8th digit is a checksum of the first seven, not a free variable, only 3 meaningful digits remain to crack in the second half. That collapses the search space from 100 million combinations down to at most 11,000 attempts (10,000 + 1,000), reducing the crack time to somewhere between 1.5 and 3 hours at one attempt per second.
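
To make that arithmetic concrete, here is a small illustrative Python sketch (not part of any attack tool): it computes the standard WPS checksum digit from the first seven digits and the collapsed worst-case attempt count. The helper name and the one-attempt-per-second rate are assumptions for illustration.

```python
def wps_checksum(first_seven: int) -> int:
    """Compute the 8th (checksum) digit of a WPS PIN from the first seven digits."""
    accum = 0
    n = first_seven
    while n:
        accum += 3 * (n % 10)  # weight alternating digits by 3 and 1
        n //= 10
        accum += n % 10
        n //= 10
    return (10 - accum % 10) % 10

full_pin_space = 10 ** 8      # naive view: 100,000,000 possible PINs
first_half = 10 ** 4          # halves are verified independently (M4/M6 feedback)
second_half = 10 ** 3         # last digit is the checksum, not a free variable

print(wps_checksum(1234567))                       # checksum digit for 1234567 -> 0
print(first_half + second_half)                    # 11,000 attempts worst case
print((first_half + second_half) / 3600, "hours")  # ~3 hours at 1 attempt/sec
```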

To mitigate this, vendors should implement lockouts, which cause the AP to terminate the session after repeated failures. The majority of vendors neglect this, adding an implementation flaw to the mix.

Offline Cracking

The offline attack, widely known as Pixie Dust, represents a more specialised version. It requires only the values captured during the early part of the exchange (chiefly the public keys and the E-Hash1/E-Hash2 commitments), plus the Enrollee’s secret nonces E-S1 and E-S2, which the attacker recovers by brute force. This vulnerability exists because some chipsets rely on weak pseudo-random number generators (like the rand() function from C). With only a 32-bit state and no external entropy, these nonces become easy targets. This constitutes an implementation flaw rather than a design flaw. Pixie Dust works far faster than online cracking; since these secret nonces blind the PIN hashes, recovering them lets the attacker derive the PIN almost immediately.
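
To see why a 32-bit, time-seeded generator is so fragile, here is a toy Python sketch. The weak_prng function is a stand-in linear congruential generator, not any vendor's actual implementation, and the 60-second seed window is an illustrative assumption.

```python
import time

def weak_prng(seed: int, n_bytes: int = 16) -> bytes:
    """Toy stand-in for a weak chipset PRNG with a 32-bit state."""
    out = bytearray()
    state = seed & 0xFFFFFFFF
    for _ in range(n_bytes):
        state = (1103515245 * state + 12345) & 0xFFFFFFFF  # classic LCG step
        out.append((state >> 16) & 0xFF)
    return bytes(out)

handshake_time = int(time.time())
observed_nonce = weak_prng(handshake_time)       # what a vulnerable device might emit

# Attacker: try every plausible seed in a window around the capture time.
for candidate in range(handshake_time - 60, handshake_time + 1):
    if weak_prng(candidate) == observed_nonce:
        print("recovered seed:", candidate)      # nonce (and thus PIN halves) now derivable
        break
```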

:::warning
Brute force also creates another major issue: resource exhaustion. Repeatedly engaging with the registration protocol and computing Diffie-Hellman keys strains the AP’s CPU. This prevents legitimate devices from gaining access and, in the worst case, crashes the router.
:::

Finally, some vendors leave WPS enabled even after a user manually disables it — yet another common implementation flaw.

Now that you understand the underlying vulnerabilities, let’s see how an attacker applies this practically.


:::warning
Disclaimer: Everything shown in this blog was performed within legal boundaries and with full authorization from the network owner. This content is strictly for educational purposes. The author does not condone or take responsibility for any misuse of the techniques demonstrated.
:::



The Attack

The kill chain follows a simple path:

  1. **Reconnaissance:** You cannot attack what you cannot see. I will utilise the Wash Wi-Fi analyser tool to discover nearby WPS-capable networks.

  2. **Attack:** Once I identify the target, I will employ Reaver, a WPS cracking tool, to perform both online and offline attacks. This demonstration highlights the offline (Pixie Dust) method.

But first, let’s set up the environment.

Install Reaver:

Since the latest version of Reaver includes both Wash and Pixie Dust, you do not need to install them separately.

sudo apt install reaver

A wireless adapter with monitor mode remains necessary to carry out this attack. Since I am using my Raspberry Pi, I will first confirm the adapter connection:

lsusb



A comprehensive guide on setting up a Raspberry Pi Zero W:

https://hackernoon.com/setting-up-pi-zero-for-pi-fi-hacking?embedable=true


Now, it’s time to switch the wireless adapter interface to monitor mode.

First, find the interface:

iwconfig

Switch to monitor mode:

For ease of use, you can utilise airmon-ng from the Aircrack-ng suite to activate monitor mode.

Install aircrack-ng (this includes airmon-ng):

sudo apt install aircrack-ng

Monitor mode:

sudo airmon-ng start <interface>

Alternatively, you can configure it manually using iwconfig and ifconfig:

ifconfig <interface> down # take the interface offline
iwconfig <interface> mode monitor # switch to monitor mode
ifconfig <interface> up # bring the interface back online

Verify that the interface operates in monitor mode rather than managed mode:

iwconfig <interface>

Now, it’s time to commence the reconnaissance stage…

Recon

Wash is a utility to discover WPS-capable networks.

By default, the tool passively surveys nearby networks by capturing broadcasts on the live interface:

sudo wash -i <interface>

This displays several columns in the output:

  • BSSID: The MAC address of the AP.
  • CHANNEL: The operating channel of the AP.
  • WPS VERSION: The supported WPS version.
  • WPS LOCKED: The current lock status reported by broadcast packets.
  • ESSID: The name (SSID) of the AP.
  • dBm: Signal strength; values closer to zero (less negative) indicate closer proximity to the target.

You can also gather more information using an active scan, which sends probe requests to all nearby networks. Note that this method is noisier and less stealthy:

wash -i <interface> --scan

Additionally, applying --json provides deeper details about the WPS firmware and other metadata in JSON format.

Now that I have the BSSID and channel of the target, I can commence the attack.

Attack

Reaver performs a brute-force attack against the WPS PIN of an AP.

Online Attack

This method performs a traditional brute-force attack on both PIN halves by actively engaging with the registration protocol. During the attack, my impatience grew as the process dragged on, eventually triggering a WPS lockout and extending the duration even further. This approach proved the least efficient for cracking the WPS PIN.

reaver -i <interface> -b <target_bssid> -vv

The -vv flag enables verbose mode, providing more detail on the background processes. Modern routers detect brute-force attempts and trigger a lockout state; consequently, Reaver stops the attack after ten consecutive errors. It then waits 60 seconds before re-checking the router status and resuming once the lockout resets. While several methods claim to avoid lockouts or force resets, they remain unreliable as they depend entirely on specific firmware.

:::tip
You can also apply --dh-small to speed up the attack. This uses smaller Diffie-Hellman keys to reduce calculation time.
:::

Note that the online method can trigger a DoS (Denial of Service) and disrupt connections for legitimate devices. Each attempt requires recalculating Diffie–Hellman keys, which strains the AP’s CPU.

Offline Attack (Pixie Dust)

This represents the fastest and most reliable method from my tests. It utilises Pixie Dust, now integrated into the latest Reaver version. This attack exploits weak secret nonces generated by functions like rand(). While this vulnerability affects only specific chipsets, it remains a major flaw across many vendors. Because it only requires values captured from the initial exchange (most importantly E-Hash1 and E-Hash2), the attack operates entirely offline after the initial capture, without further participation in the registration protocol.

sudo reaver -i <interface> -b <target_bssid> -K -vv

The -K flag specifies Pixie Dust attack mode.

You can see how it captured all the necessary components:

  • Seed N1: The seed for generating N1 (the random nonce of the enrollee)

  • Seed ES1: The seed for generating E-S1 (the first secret nonce of the enrollee)

  • Seed ES2: The seed for generating E-S2 (the second secret nonce of the enrollee)

  • PSK1: The first half of the PIN.

  • PSK2: The second half of the PIN.

  • ES1: The first secret nonce of the enrollee.

  • ES2: The second secret nonce of the enrollee.

  • WPS PIN: The cracked WPS PIN.

The seed is the starting point for the pseudo-random number generator. A weak seed is what makes many chipsets vulnerable; some are known to use Unix timestamps as seeds, which are predictable and trivially guessable by an attacker.

Now we have the WPS PIN and the PSK for the Wi-Fi network.

This method proved extremely efficient, taking only 9 seconds to crack the PIN compared to the hours required for an online attack.

To compare the two, let’s return to the Airbnb analogy. You rely on a specific piece of information to verify me as the actual customer — a shared secret between us. In online cracking, I keep knocking on your door and shouting different secrets until I hit the right one. This remains loud and unreliable.

In offline cracking, however, I examine your door’s lock, discover its weak internal mechanics, and take measurements. I then go home, build a matching key, and use it the next day to enter your home effortlessly. Scary, right? This demonstrates exactly what happens when you enable WPS without realising the risk.



Conclusion

The WPS PIN functions as intended for devices with displays that generate dynamic PINs. The vulnerability emerges primarily when headless devices, such as routers, use static PINs. Furthermore, attacks like Pixie Dust sidestep the design entirely and exploit implementation errors on the vendor’s side.

Overall, the principle ‘the security of a system remains only as strong as its weakest component’ best explains the failure of WPS. The specification itself anticipated many of these risks — but by offering a convenience-first option and leaving critical safeguards to vendor discretion, the architecture was only ever as secure as its weakest implementation.

The modern world categorises WPS as deprecated functionality that users should no longer employ. Despite this, legacy APs in small businesses and homes still widely rely on it.

Once we install a router, most of us neglect the configuration. Yet, regardless of how convenient vendors make it, the responsibility to configure and maintain a secure network ultimately falls on the user.

With this, we have now seen how WPS works and how attackers exploit it.

Until next time, stay safe…

The Case for Local AI Has Never Been Stronger

2026-04-28 13:59:41


:::tip This is not clickbait.

All estimates and benchmark scores here are based on real, publicly available data!

:::

Section 1: The API Bill Is Due: Why Running AI Locally Is Now a Financial Decision

Last month, a founder I mentor sent me a screenshot of his OpenAI billing dashboard.

The number was $2,847.

For a single month. For a two-person startup.

His product was barely in beta!

I have been there.

We all have.

If you are building anything serious with AI in 2026 — a product, a research pipeline, an agentic workflow, even a personal productivity stack — you have almost certainly felt that specific, sinking feeling when the invoice arrives.

You stare at it, do the math on what Year 2 looks like, and quietly start reconsidering your architecture.

Here is the thing that people all over the world are starting to realize: you no longer have to pay it.

The generation of open-weight models that shipped in early 2026 has closed the benchmark gap with Claude Opus-class performance for the vast majority of professional use cases.

  • Kimi K2.6 scores 80.2% on SWE-Bench Verified — Claude Opus 4.6 scores 80.8%.
  • GLM-5.1 achieves 94% of Claude Opus 4.6’s coding performance at a fraction of the cost.
  • MiniMax M2.7 delivers 56.22% on SWE-Bench Pro with only 10B activated parameters — 94% of GLM-5.1’s performance at roughly one-fifth the API cost.

And that is before you consider running them locally.

Which is exactly what this article is about.

Because. It. Is. Almost. Free!


The Pricing Landscape in April 2026

Let me show you the numbers side by side. These are current, verified rates as of April 2026.

| Model | Provider | Input ($/M tokens) | Output ($/M tokens) | Blended (3:1 ratio) | License |
|----|----|----|----|----|----|
| Claude Sonnet 4.6 | Anthropic | ~$3.00 | ~$15.00 | ~$6.00 | Proprietary |
| GPT-5.4 | OpenAI | ~$2.50 | ~$10.00 | ~$5.00 | Proprietary |
| Gemini 3.1 Pro | Google | ~$1.25 | ~$5.00 | ~$2.50 | Proprietary |
| Kimi K2.6 | Moonshot AI | $0.95 | $4.00 | $1.71 | Open weights |
| GLM-5.1 | Z.AI | Comparable to Kimi | — | — | Closed weights |
| MiniMax M2.7 | MiniMax | $0.30 | — | — | Closed weights |

Sources: Artificial Analysis — Kimi K2.6, Atlas Cloud comparison, TokenMix

That is already a compelling delta.

But Kimi K2.6’s cached input drops to $0.16/M tokens for agent workloads with stable system prompts.

For multi-turn agentic pipelines, the effective input cost can fall to $0.03–0.07 per MTok — territory that renders the proprietary premium genuinely indefensible for most workloads.

And for truly local inference?

The marginal token cost is zero!

As a market-level data point: LLM API prices dropped approximately 80% from 2025 to 2026. The direction of travel is unmistakable.


The True Cost Over Time

Let me make this concrete with a real cost model.

Assume a power user or small team generating 200,000 output tokens per day — roughly 6M output tokens per month.

That is a moderately busy coding assistant, research pipeline, or content workflow.

| Scenario | Year 1 | Year 2 | Year 3 | Total |
|----|----|----|----|----|
| Proprietary API only (Claude Sonnet blended $6/M) | $43,200 | $43,200 | $43,200 | $129,600 |
| Hybrid (50% local, 50% API) | $21,600 + ~$2,000 hardware | $21,600 | $21,600 | $66,800 |
| Fully local (M5 Ultra amortised over 3 yrs) | ~$1,333 hardware/yr + power | $1,333 | $1,333 | ~$4,500 |

The break-even on a $4,000 Mac Studio M5 Ultra versus full proprietary API spend arrives in under 6 weeks at these usage levels.

At more modest usage (50K output tokens/day), it is still under 6 months.
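
If you want to rerun this break-even math with your own numbers, here is a minimal Python sketch. The default figures mirror the article's assumptions (a $4,000 machine against roughly $43,200 per year of API spend) and are illustrative, not a pricing guarantee.

```python
def breakeven_weeks(hardware_cost: float, monthly_api_cost: float) -> float:
    """Weeks until local hardware pays for itself versus ongoing API spend."""
    return hardware_cost / monthly_api_cost * 4.33  # average weeks per month

heavy_monthly = 43_200 / 12                      # ~$3,600/month at 200K output tokens/day
print(round(breakeven_weeks(4_000, heavy_monthly), 1), "weeks")       # ~4.8 weeks
print(round(breakeven_weeks(4_000, heavy_monthly / 4), 1), "weeks")   # ~19 weeks at 1/4 the usage
```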

This is not even close.

But the financial case is only half the story.

The other half is data sovereignty, latency, and control.

When your AI runs locally, your proprietary code, client data, and internal documents never leave your machine.

No Terms of Service to audit.

No per-seat pricing escalations.

No model deprecations that break your production stack overnight.

I can see a lot of companies making a real case for local LLMs here!

Especially in Europe!

Let’s look at what is actually powering that local inference.


Section 2: Under the Hood – The Engineering Breakthroughs Making This Possible

Running a trillion-parameter model on a consumer machine would have been science fiction two years ago.

What changed is not just hardware – it is a cluster of intersecting software and architectural innovations that together collapse the compute requirements by orders of magnitude without sacrificing proportional accuracy.

Understanding these is not optional if you want to make good decisions about your stack.


Mixture-of-Experts (MoE) and Sparse Activation

The single most important architectural shift in frontier open-weight models is the Mixture-of-Experts design.

Kimi K2.6 is a canonical example: it has 1 trillion total parameters but activates only 32 billion per forward pass.

The model routes each token through a learned gating mechanism that selects the most relevant subset of expert sub-networks for that specific input.

What this means practically: you get the reasoning depth and knowledge breadth of a trillion-parameter model at the inference cost of a 32B dense model.

The memory bandwidth and compute consumed per generated token are driven by the active parameters and the KV cache rather than the total weight size (although the full expert set still has to be resident in memory, which is exactly where a large unified memory pool helps).

On hardware with unified memory (more on this shortly), this distinction is the difference between possible and impossible.
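
A back-of-the-envelope Python sketch makes the resident-versus-streamed distinction concrete. The parameter counts, layer count, head dimensions, and context length below are hypothetical round numbers, not the published architecture of any specific model.

```python
GB = 1024 ** 3

def weights_gb(params: float, bits: float) -> float:
    """Size of the weight tensor at a given bit-width."""
    return params * bits / 8 / GB

def kv_cache_gb(layers: int, kv_heads: int, head_dim: int,
                context: int, bytes_per_elem: int = 2) -> float:
    # 2x for keys and values, one entry per layer per token position
    return 2 * layers * kv_heads * head_dim * context * bytes_per_elem / GB

# Hypothetical 200B-total / 20B-active MoE model, 4-bit weights
resident = weights_gb(200e9, 4)   # every expert must fit in unified memory
streamed = weights_gb(20e9, 4)    # but only the active experts move per token
kv = kv_cache_gb(layers=60, kv_heads=8, head_dim=128, context=32_768)
print(f"resident ≈ {resident:.0f} GB, streamed per token ≈ {streamed:.1f} GB, KV cache ≈ {kv:.1f} GB")
```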

GLM-5.1 takes the opposite approach: it is a 754B dense model that activates all parameters on every call, trading inference efficiency for consistent depth across all tokens.

This is why GLM-5.1 excels at tasks requiring sustained mathematical reasoning or complex algorithm design — it brings the full model capacity to every token.

But it is considerably harder to run locally without heroic quantization.

For most local deployments, MoE-class models like Kimi K2.6 are the pragmatic choice.


Quantization-Aware Training (QAT)

Quantization is the process of representing model weights in lower numerical precision — for example, converting 16-bit floating point weights to 4-bit integers.

This shrinks the model’s memory footprint by 4× and accelerates inference because low-precision arithmetic is cheaper to compute.

The problem historically was accuracy loss: naïve quantization degrades model quality, especially at very low bit-widths like 2-bit or 4-bit.

Quantization-Aware Training (QAT) solves this by integrating weight precision reduction directly into the training process itself.

Rather than compressing a trained model after the fact (Post-Training Quantization, or PTQ), QAT exposes the model to the effects of quantization during training, allowing the model to learn weights that remain accurate under low-precision representation.

The result is a 4-bit quantized model that preserves far more of the full-precision model’s capability than PTQ can achieve — particularly important for complex reasoning chains and multi-step code generation.
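
Here is a minimal NumPy sketch of the 4-bit quantize/dequantize round-trip that QAT simulates inside the training loop. It is illustrative only and deliberately omits the straight-through gradient estimator a real QAT setup would use.

```python
import numpy as np

def quantize_dequantize(weights: np.ndarray, bits: int = 4) -> np.ndarray:
    """Symmetric per-tensor quantization round-trip (the op QAT simulates in training)."""
    qmax = 2 ** (bits - 1) - 1                 # 7 for signed 4-bit integers
    scale = np.abs(weights).max() / qmax       # map the largest weight onto the int range
    q = np.clip(np.round(weights / scale), -qmax - 1, qmax)  # integer codes
    return q * scale                           # dequantized (lossy) weights

rng = np.random.default_rng(0)
w = rng.normal(0, 0.02, size=10_000).astype(np.float32)  # toy weight tensor
w_hat = quantize_dequantize(w, bits=4)
print("mean abs error:", np.abs(w - w_hat).mean())
# QAT inserts this round-trip into the forward pass during training so the model
# learns weights that stay accurate under 4-bit representation; PTQ applies it
# only after training, which is where most of the accuracy loss comes from.
```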

A 2025/2026 advance called ZeroQAT pushes this further by eliminating the backpropagation requirement of traditional QAT entirely, using forward-only gradient estimation instead.

This reduces memory overhead so dramatically that ZeroQAT enables fine-tuning of a 13B model at 2–4 bit precision on a single 8GB GPU, and even allows fine-tuning a 6.7B model on a smartphone.

For local LLM deployment, QAT is what makes the difference between a model that fits in your Mac’s unified memory and one that doesn’t.


Delta Gated Networks and Sparse Attention Mechanisms

Beyond MoE, a class of architectural innovations broadly grouped under sparse gating mechanisms further reduces the compute and memory bandwidth required per inference step.

Delta Gated Networks (DGN) use learned sparse gates to activate only the network pathways most relevant to the current token and context, rather than propagating activations through the full model graph.

The implication for hardware like Apple Silicon is significant: inference efficiency on unified memory systems is bottlenecked not by raw FLOPS but by memory bandwidth — how fast the hardware can stream model weights from memory into compute units.

Sparse activation mechanisms reduce the effective working set of weights that need to be streamed per token, which directly translates to higher tokens-per-second on bandwidth-constrained hardware.

MiniMax M2.7’s “self-evolving” agent capabilities partially rely on this class of architectural efficiency — the model can maintain long agentic sessions with substantially lower memory pressure than equivalently performing dense models.


Flash Attention 3, Speculative Decoding, and Continuous Batching

Three additional inference optimizations work in concert with the architectural improvements above:

Flash Attention 3 rewrites the self-attention computation to avoid materializing the full attention matrix in GPU/accelerator memory, reducing memory usage for long-context inference from O(n²) to O(n).

For models with 128K–1M token context windows, this is not a minor optimization — it is what makes long-context inference on consumer hardware possible at all.

Speculative decoding uses a small draft model to predict multiple tokens ahead, with the large model verifying rather than generating.

On hardware where memory bandwidth is the bottleneck (as it is on Apple Silicon), this technique can nearly double effective throughput for sequential generation tasks.
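
As a toy illustration of that draft-and-verify loop, here is a short Python sketch. draft_next and target_next are hypothetical stand-ins for the two models, and real inference servers verify all drafted tokens in a single batched forward pass rather than one call per token.

```python
import random

def draft_next(context: str) -> str:   # cheap draft model: fast but sometimes wrong
    return random.choice("abc")

def target_next(context: str) -> str:  # expensive target model: treated as ground truth
    return "a" if len(context) % 2 == 0 else "b"

def speculative_step(context: str, k: int = 4) -> str:
    """Draft k tokens cheaply, then let the target accept the longest matching prefix."""
    drafted, ctx = [], context
    for _ in range(k):                      # 1) draft k candidate tokens
        t = draft_next(ctx)
        drafted.append(t)
        ctx += t
    accepted, ctx = [], context
    for t in drafted:                       # 2) target checks each drafted token
        if target_next(ctx) == t:
            accepted.append(t)
            ctx += t
        else:                               # 3) first mismatch: take the target's token, stop
            accepted.append(target_next(ctx))
            break
    return context + "".join(accepted)

ctx = ""
for _ in range(5):
    ctx = speculative_step(ctx)
print(ctx)  # several tokens emitted per target verification step instead of one
```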

Continuous batching allows inference servers like Ollama, vLLM, and llama.cpp to interleave requests from multiple sessions without the latency penalties of static batching.

For local agentic systems running multiple concurrent agent loops, this is what keeps the system responsive under load.

On an M5 Ultra with 256GB unified memory, a well-quantized Kimi K2.6 MoE model (4-bit, ~100–140GB on disk) running with all three optimizations can realistically sustain 35–60 tokens per second for interactive use – more than fast enough for agentic coding, writing, and research workflows.
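
A rough sanity check of that throughput figure can be done with a bandwidth-bound estimate. Everything below (active parameter count, bit-width, the 0.5 efficiency factor, the bandwidth numbers) is an illustrative assumption, not a measurement.

```python
def decode_tokens_per_second(active_params: float, bits_per_weight: float,
                             bandwidth_gbps: float, efficiency: float = 0.5) -> float:
    """Rough ceiling for bandwidth-bound autoregressive decoding.

    Each generated token requires streaming (roughly) the active weights once
    from memory; `efficiency` is a fudge factor for KV cache and other overhead.
    """
    bytes_per_token = active_params * bits_per_weight / 8
    return efficiency * bandwidth_gbps * 1e9 / bytes_per_token

# Hypothetical figures: 32B active params at 4-bit, ~1.2 TB/s vs ~614 GB/s bandwidth
print(round(decode_tokens_per_second(32e9, 4, 1200), 1), "tok/s")  # ≈ 37.5 (Ultra-class)
print(round(decode_tokens_per_second(32e9, 4, 614), 1), "tok/s")   # ≈ 19.2 (Max-class)
```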


Agentic Scale: What These Models Were Built For

One detail about Kimi K2.6 that deserves more attention: it scales horizontally to 300 sub-agents executing 4,000 coordinated steps, dynamically decomposing tasks into parallel, domain-specialized subtasks.

This is not a chatbot capability — it is a production orchestration capability that previously required the OpenAI Assistants API, a LangChain/LangGraph setup, or a managed agentic platform.

Running this locally, against a quantized model with zero marginal token cost, changes the economics of agentic AI development entirely.


Section 3: The Machine – Mac Studio M5 Ultra as the Definitive Local LLM Workstation (Not Yet Released at the Time of Writing)

LLMs have been run on everything from a cloud A100 to a hobbyist RTX 3090 to a Mac Mini M2.

Nothing has come close to Apple Silicon for the combination of performance-per-watt, memory bandwidth, and friction-free setup that local LLM inference demands.

The Mac Studio M5 Ultra, when it ships, is going to be the machine that makes everything in the previous two sections practical for a working developer or consultant without a server rack in their office.

Here is everything we know.


Why Apple Silicon Is Uniquely Suited for Local LLMs

Most discussions of AI hardware focus on FLOPS — raw compute.

For LLM inference, this is the wrong metric. Pure and simple.

The actual bottleneck is memory bandwidth: how fast can the hardware stream model weights from memory into the compute units that process each token?

On an NVIDIA GPU, model weights must fit inside dedicated VRAM; anything larger spills to system memory and has to be streamed across the PCIe bus, a bandwidth bottleneck that caps performance regardless of the GPU’s FLOPS ceiling.

On Apple Silicon, the CPU, GPU, and Neural Engine all share a single unified memory pool with no inter-chip bandwidth overhead.

The M5 Max already delivers approximately 614 GB/s of memory bandwidth to a unified pool far larger than the VRAM on most discrete GPUs, and roughly an order of magnitude more than a PCIe bus can stream.

The M5 Ultra, fusing two M5 Max dies, is expected to approach or exceed 1.2 TB/s.

For quantized LLM inference, this is not just an advantage — it is a category difference.


Confirmed M5 Max Specs (As of March 2026)

Apple officially launched the M5 Max in the MacBook Pro in March 2026.

These are confirmed specs:

| Specification | M5 Max |
|----|----|
| CPU | 18-core (6 “super cores” + 12 performance cores) |
| GPU | 32-core or 40-core, with Neural Accelerator in every core |
| Max Unified Memory | 128 GB |
| Memory Bandwidth | ~614 GB/s |
| AI Performance vs M4 Max | Up to 4× faster |
| AI Performance vs M1 | Up to 8× faster |
| Connectivity | Thunderbolt 5, Wi-Fi 7 (N1 chip), Bluetooth 6 |
| Default SSD | 2 TB (M5 Max), 1 TB (M5 Pro) |

Source: Apple Newsroom

The headline for AI workloads is the Neural Accelerator embedded in every GPU core — a first for Apple Silicon.

This means AI-specific matrix math can run in parallel with traditional GPU workloads at a hardware level, rather than being routed exclusively through the separate Neural Engine.


What to Expect From the M5 Ultra (Rumored/Expected)

The M5 Ultra has not been officially released at time of writing, but Apple’s UltraFusion pattern is well-established: the Ultra is two Max chips fused at the die level, doubling all the specs that can be doubled.

Based on confirmed M5 Max specs and analyst estimates:

| Specification | M5 Ultra (Expected) |
|----|----|
| CPU | Up to 36 cores |
| GPU | Up to 80 cores |
| Max Unified Memory | Up to 256 GB (down from 512 – RAM shortage-constrained) |
| Memory Bandwidth (estimated) | ~1.2+ TB/s |
| AI Performance vs M5 Max | ~2× (doubling of Neural Accelerator count) |
| Architecture | Fusion Architecture (CPU + GPU on separate dies — configurable) |

Sources: MacRumors, Macworld, TechRepublic

One notable architectural change: Apple is separating the CPU and GPU onto distinct blocks within the Fusion Architecture.

This means buyers will be able to configure different CPU/GPU ratios — a long-requested option for ML engineers who need maximum GPU cores but not the highest-tier CPU.

These specifications are analyst estimates based on Apple’s established patterns.

Treat them as directional, not confirmed.


Pricing Around the World

Current Mac Studio M3 Ultra starts at $3,999 USD.

Analysts expect a modest increase driven by two factors: rising DRAM costs (Apple removed the 512GB RAM option entirely in early 2026 and has raised prices on remaining configurations) and US tariff pressure on overseas components.

Rough estimated starting prices for the M5 Ultra Mac Studio:

| Region | Estimated Base Price (M5 Ultra) |
|----|----|
| USA | $4,200 – $4,500 |
| UK | £3,800 – £4,100 |
| EU | €4,400 – €4,700 |
| India | ₹3,60,000 – ₹3,90,000 |
| Australia | AUD $6,500 – $7,000 |
| Singapore | SGD $5,800 – $6,200 |
| Japan | ¥640,000 – ¥680,000 |

These are estimates based on current M3 Ultra pricing plus analyst projections.

Sources: Macworld, TechRepublic price analysis


Release Timeline: When to Expect It

As of April 2026:

  • Most likely window: WWDC, June 8, 2026.
    • Apple used WWDC to launch the M2 Mac Studio in June 2023.
    • Internal code in macOS Tahoe points to a Studio update in summer 2026.
    • Bloomberg’s Mark Gurman has revised his estimate to “middle of the year.”
  • Fallback window: October–November 2026, if supply chain snags (flagged by Gurman on April 19, 2026) worsen.
  • Current availability pain: As of April 2026, Mac Studio configurations with 128GB and 256GB RAM are out of stock on Apple’s US storefront. Delivery estimates for available configurations range from 3–12 weeks depending on configuration.

Sources: Macworld, MacRumors recap


Wait or Buy Now? A Structured Decision

Wait for M5 Ultra if:

  • You are on an Intel Mac or M1 generation and planning a major upgrade
  • Your primary use case is local LLM inference, ML training, or 3D rendering
  • You can absorb 3–6 months of continued API costs without it breaking your business
  • You want the Neural Accelerator per GPU core advantage for sustained agentic workloads

Buy M4 Max now if:

  • You have an active project that is blocked today for lack of local inference capacity
  • You are currently on an M2 or older M3 Mac Studio and the upgrade is already substantial
  • Your primary workloads are not inference-bandwidth-constrained (e.g., general development, content creation, light model testing)

For pure local LLM use, the M5 Ultra is architecturally superior in ways that will matter for 3–5 years.

But the M4 Max is not slow — it is already exceptional.

The decision is about how long you are willing to wait and how much the current API bill is costing you.


A Note on the Global RAM Shortage

This deserves its own paragraph because it affects buying decisions across the board.

AI hyperscalers — Microsoft, Google, Amazon, Meta — are consuming memory at a rate that is crowding out consumer and prosumer supply.

Apple has already removed the 512GB RAM option from the Mac Studio and raised prices on remaining configurations.

The 256GB ceiling on the M5 Ultra is likely not a design decision — it is a supply constraint.

If you are planning a local AI workstation in 2026, assume high-memory configurations will be constrained, hard to order, and premium-priced.

Plan your architecture around whatever is actually available, not the theoretical maximum spec sheet.


Section 4: Your Local AI Brain – Setting Up a Fully Agentic System With OpenClaw

Having the right model and the right hardware is half the equation.

The other half is the orchestration layer — the software that turns a capable LLM into a system that actually does things in the world autonomously, without requiring you to babysit every task.

OpenClaw is rough-edged, vibe-coded, and a security nightmare, but it is a working agentic AI layer.

What began as a personal side project called Clawdbot by Austrian developer Peter Steinberger in November 2025 has become — genuinely, measurably — the most starred repository in GitHub history, hitting 347,000 stars by April 2026.

It is model-agnostic, self-hosted, privacy-first, and built around a skills-based plugin architecture that lets you compose almost any workflow you can describe in natural language.

This section gives you a setup path and ten concrete workflows to prove the system works.


Why Not OpenFang? The Honest Answer

Before we go further, I want to address an alternative: OpenFang, which markets itself as an “Agent Operating System” rather than an agent framework.

Written in Rust, it is architecturally ambitious: 7 autonomous “Hands,” 53 tools, 40 messaging channels, 1,767+ tests, WASM-sandboxed tool execution, cryptographic audit chains, and taint tracking for secrets.

On paper, it reads like the future.

In practice, it is pre-1.0.

The project itself states openly: “Breaking changes may occur between minor versions until v1.0. Pin to a specific commit for production deployments.”

The architecture is solid and the security model is genuinely ahead of OpenClaw’s.

But the entry barrier is high — too many bugs with the GUI cripple the ease of use.

Configuration is terminal-based and complex, and the ecosystem of community skills and integrations is nowhere near OpenClaw’s maturity.

The goal is a stable v1.0 by mid-2026, at which point this calculus may change.

For now: if you have a Rust-proficient team and need an agent OS with deep security guarantees for a production enterprise deployment, keep OpenFang on your radar.

**For everyone else who wants to run a productive local agent today, OpenClaw is the pragmatic choice.** But keep it sandboxed, with no personal data access and strictly no critical data! Your API keys and the vast majority of your credentials are stored as plaintext!

**Prompt injection is a piece of cake.** Officially, OpenClaw is a hacker’s dream. Or, even more scarily, a Russian / Chinese / Israeli / Korean state-sponsored hacker team’s attack playground come true! If you are not yet aware, wake up now!

Sources: OpenFang GitHub, till-freitag.com OpenClaw overview


Setting Up a Sandboxed But Fully Functional OpenClaw

The default OpenClaw installation gives the agent full host access when running your personal main session.

That is powerful and also where people get into trouble.

The setup below gives you a system that is functionally complete — including internet access — but architecturally isolated.

Step 1: Create an Isolated Agent User

sudo useradd -m ai-agent
sudo passwd ai-agent

The ai-agent user gets no access to your personal home directory.

All agent operations are scoped to ~ai-agent/workspace.

This single step eliminates the most common attack surface: a compromised or misbehaving skill reading or writing your personal files.

Step 2: Dockerize the OpenClaw Gateway

docker run -d \
  --name openclaw-gateway \
  --user ai-agent \
  -v ~/agent-workspace:/home/ai-agent/.openclaw/workspace:rw \
  -v ~/agent-docs:/home/ai-agent/docs:ro \
  -p 3000:3000 \
  openclaw/openclaw:latest

Mount only what the agent needs: a writable workspace volume and read-only document mounts for knowledge bases.

Never mount your home directory or any path containing credentials, SSH keys, or browser profiles.

Step 3: Connect to a Local LLM via Ollama

Edit ~/.openclaw/openclaw.json:

{  "model": "ollama/kimi-k2.6-q4_k_m",  "agents": {    "defaults": {      "maxTokens": 8192,      "sandbox": {        "mode": "non-main"      }    }  }}

If Kimi K2.6 quantized weights are not yet available in the Ollama registry at your time of reading, the Gemma 4 27B MoE (4-bit quantized) is an excellent substitute, scoring 85.5% on the τ2-bench agentic tool use benchmark.

Step 4: Grant Internet Access via a Dedicated Sandbox

For workflows that need web access, run a lightweight Docker container with a restricted browser tool and scoped egress:

docker run -d \
  --name agent-browser-sandbox \
  --network agent-net \
  --dns-search allowlist.internal \
  openclaw/browser-sandbox:latest

Configure OpenClaw to route browser tool calls through this container.

The agent gets internet access; your host machine’s network stack stays isolated.

Step 5: Audit Every Skill Before Installing

Skills are the lifeblood of OpenClaw — and the primary attack surface.

Before installing any community skill from ClawHub:

cat ~/.openclaw/workspace/skills/<skill-name>/metadata.json | jq '.permissions'

If a skill requests shell.execute or fs.read_root without an obvious reason tied to its core function, do not install it.

Cisco’s AI security team has documented data exfiltration via malicious third-party skills.

The skill repository does not vet submissions the way an app store does.

Your threat model is real.
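
If you want to run that check across every installed skill at once, a small Python sketch like the following works. The skills directory layout and the permissions field are assumptions carried over from the example above, so adjust them to match your installation.

```python
#!/usr/bin/env python3
"""Scan installed OpenClaw skills and flag risky permissions (illustrative sketch)."""
import json
from pathlib import Path

SKILLS_DIR = Path.home() / ".openclaw" / "workspace" / "skills"  # assumed layout
RISKY = {"shell.execute", "fs.read_root"}                        # from the audit advice above

def audit(skills_dir: Path = SKILLS_DIR) -> None:
    for meta_file in sorted(skills_dir.glob("*/metadata.json")):
        try:
            perms = set(json.loads(meta_file.read_text()).get("permissions", []))
        except (OSError, json.JSONDecodeError) as err:
            print(f"[warn] {meta_file.parent.name}: unreadable metadata ({err})")
            continue
        flagged = perms & RISKY
        status = "REVIEW" if flagged else "ok"
        print(f"{meta_file.parent.name:30s} {status:7s} {', '.join(sorted(flagged))}")

if __name__ == "__main__":
    audit()
```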

Sources: AlphaTechFinance OpenClaw guide, DigitalOcean OpenClaw


10 Viral Use Cases for an Isolated OpenClaw

These are production-viable workflows, not demos.

Each has been documented by real users in the OpenClaw community.

With internet access scoped to a sandboxed browser, all ten remain fully operational.

Use Case 1: Overnight Coding Agent

  • Configure OpenClaw with a GitHub skill (scoped to specific repositories) and schedule a nightly cron task.
  • The agent picks up your issue queue, writes branches for well-defined tickets, runs your test suite inside the Docker sandbox, and opens draft PRs with test results attached.
  • Wake up to reviewable code, not a blank morning.

Use Case 2: Competitive Intelligence Monitor

  • Give the agent a list of competitor domains and a sandboxed browser.
  • Set a daily cron task.
  • Each morning, it checks pricing pages, job listings (signal for product roadmap), blog posts, and GitHub repos — and delivers a 300-word Telegram digest with delta from yesterday.
  • No Crunchbase subscription required.

Use Case 3: Automated SEO Audit Pipeline

  • Mount your Google Search Console CSV exports as a read-only volume.
  • The agent parses crawl errors, broken internal links, canonical mismatches, and missing meta descriptions — then generates a weekly prioritized Markdown report to your Obsidian vault.
  • Reproducible, documented, and free.

Use Case 4: Personal Knowledge Base Q&A

  • Mount your Obsidian vault or local Notion export as read-only.
  • On first run, the agent builds a vector index using a local embedding model.
  • Thereafter, message it on WhatsApp: “What did I write about the trade-offs between MoE and dense architectures?” — and get a synthesized answer from your own notes in seconds.

Use Case 5: Invoice and Expense Processing

  • Drop vendor PDFs and receipts into a watched folder.
  • The agent extracts line items, categorizes by project code, updates a local spreadsheet, flags anomalies against your budget, and archives the originals.
  • No data leaves your machine.
  • Your accountant gets a clean export.
  • This one saves hours a month.

Use Case 6: Multi-Platform Content Repurposing

  • Feed the agent a long-form article draft.
  • Using a SOUL.md persona file that encodes your voice, tone preferences, and brand guidelines, it autonomously produces: a Twitter/X thread, a LinkedIn carousel outline, a HackerNoon teaser intro, and a newsletter summary.
  • Four assets from one input, all in your voice.

Use Case 7: Daily Research Digest

  • Using the sandboxed browser, the agent fetches new papers from arXiv (filtered by your keyword list), the top Hacker News discussions, and your RSS feeds — every morning before you wake up.
  • It synthesizes a personalized 500-word briefing and sends it to Signal.
  • You start every day informed.
  • This is incredibly useful!

Use Case 8: Pre-Push Code Review

  • Hook OpenClaw into your git pre-push hook via a shell script.
  • Before every push, the agent receives the diff, evaluates it against your coding-standards.md file, and returns a structured review: style violations, potential bugs, missing tests, and a pass/fail recommendation.
  • Sub-200ms on M5 Max class hardware.

Use Case 9: Automated Meeting Prep

  • Connect the agent to your calendar (read-only).
  • Thirty minutes before every external meeting, it researches all attendees via the sandboxed browser (LinkedIn, company site, recent press), pulls relevant email threads if the email skill is enabled (scoped to the specific contact), and delivers a one-page brief to your Telegram.
  • You walk into every meeting prepared.
  • People will pay for this!

Use Case 10: Autonomous Newsletter Production

  • Define a topic list, a publication schedule, and a quality bar.
  • The agent researches each topic, drafts the article, self-edits against your style guide, and queues the post as a draft in your Ghost or Substack account.
  • You review and publish.
  • The research-to-draft cycle runs entirely on local inference.
  • Zero marginal token cost.

Sources: DigitalOcean OpenClaw, Clawbot.blog April 2026 update, KDnuggets OpenClaw explainer


Section 5: The Urgency Is Real – What Andrej Karpathy’s Last Three Projects Tell You About the Future of Work

I want to shift into another register for a moment and talk about something that does not show up in benchmark tables: the pace of change itself.

The cost argument is compelling.

The hardware argument is concrete.

But the reason I am writing this article with genuine urgency is not the numbers — it is the signal I take from watching where the smartest people in this field are spending their time.

Nobody has a better read on where the frontier is actually heading than Andrej Karpathy.

And his last three major projects form a pattern that I think every professional who depends on knowledge work needs to understand.


Project 1: “2025 LLM Year in Review” — The Paradigm Has Already Shifted

In December 2025, Karpathy published what is effectively a field report on six paradigm shifts that collectively rewired the LLM landscape over the course of a single year.

I recommend reading the original in full.

Here are the two claims that I think have the most direct implications for the working professional.

Claim 1: RLVR has replaced the stable three-stage training stack.

  • Prior to 2025, the production recipe for frontier LLMs was settled: pretraining → supervised fine-tuning → RLHF.
  • Reinforcement Learning from Verifiable Rewards (RLVR) upended this.
  • By training models against objective, automatically verifiable reward functions — mathematics and code — RLVR forces models to develop genuine reasoning traces rather than learned response patterns.
  • The models that win on hard reasoning benchmarks in 2026 are RLVR-trained.
  • This is why the benchmark gap between open-weight and proprietary models has collapsed so quickly: RLVR is a training recipe anyone with compute and data can run.

Claim 2: We have exploited less than 10% of this paradigm’s potential.

  • Karpathy was direct: the industry has barely scratched the surface of what RLVR-trained models can do in long-horizon reasoning and agentic operation.
  • The models available today — as impressive as they are — are not the plateau.
  • They are the floor.
  • Unbelievable!

Sources: Karpathy’s Bear Blog, MLOps Substack analysis


Project 2: Eureka Labs — Rebuilding Education From the Ground Up

  • In July 2024, Karpathy announced Eureka Labs: an AI-native school with a mission to provide every learner with the equivalent of a deeply knowledgeable, infinitely patient personal tutor.
  • The first product is LLM101n — an undergraduate-level course guiding students through training their own AI, with the AI teaching assistant itself modeled on Feynman-style pedagogical depth.
  • The implication here is not just educational. It is about the compression of reskilling timelines.
  • The traditional path from “I don’t understand transformers” to “I can build, fine-tune, and deploy an LLM” took years of graduate coursework or expensive bootcamps.
  • Eureka Labs is explicitly designed to collapse that to weeks for a motivated learner.
  • What this signals for working professionals: the window between “early adopters know this” and “everyone knows this” is shrinking.
  • AI fluency — the ability to evaluate, orchestrate, and deploy LLM systems — is transitioning from a specialized skill to a baseline expectation.
  • The education infrastructure to achieve this is now being built at scale.

Source: Silicon Republic


Project 3: LLM Knowledge Bases — The “Second Brain” Wiki

  • Using LLMs to compile unstructured data into an active, self-updating personal wiki—typically visualized through Obsidian—is proof that personal knowledge management has moved beyond traditional, static Retrieval-Augmented Generation (RAG) systems.
  • It requires only a willingness to let AI orchestrate your scattered research papers, Python scripts, and n8n workflow logs.
  • This methodology has produced a generation of self-augmenting researchers who interact dynamically with accumulated knowledge rather than starting from scratch.
  • This compounding architecture proves that maintaining a localized, ever-evolving intellectual ecosystem is now fully automated and structurally accessible.
  • An AI agent actively cleans, links, and updates the knowledge base in the background.
  • The developers synthesizing insights on models like MiniMax, building complex agentic workflows, and executing digital syndication strategies can rely on these dynamic, AI-curated wikis to turn fragmented data into compounding leverage.
  • Some have gone as far as to call this system the end of the line for RAG – as extreme as it seems!

The Professional Stakes: Let Me Be Direct

The 2026 knowledge worker who cannot evaluate, orchestrate, or deploy LLMs is in the same structural position as the 1996 knowledge worker who could not use email.

Not “behind the curve” — actively disadvantaged in ways that compound over time.

In software engineering specifically, the benchmark gap between proprietary and open-weight models has effectively closed for coding tasks.

Teams that cannot reason about which model to use for which task, how to run it locally, how to orchestrate it in an agentic loop, and how to evaluate its outputs are not just leaving money on the table — they are building technical debt into their competitive position that will be extremely expensive to unwind.

The good news: the resources to close this gap are free, the hardware is accessible, and the ecosystem tools like OpenClaw make the practical side tractable without infrastructure expertise.

The only thing this requires is the decision to start.


Section 6: The Next Six Months – A Forecast for Local AI That Exceeds Claude


Forecasting in AI is humbling.

The pace of development has embarrassed almost every prediction made since 2022, and the models available three months from now will almost certainly make some of what I have written here feel conservative.

That said, the signals are clear enough that an honest probability-weighted view is worth making explicit.


What Has Already Happened (April 2026 Baseline)

Before forecasting forward, let’s be precise about where we are. As of today:

  • The benchmark gap between open-weight and proprietary frontier models has effectively closed for coding tasks.
  • Kimi K2.6 achieving 80.2% on SWE-Bench Verified vs. Claude Opus 4.6 at 80.8% is not a gap that meaningfully affects most production use cases.
  • Open-weight models under MIT/Apache 2.0 licenses now cost 10–100× less per token than proprietary APIs for self-hosted deployments.
  • The tooling layer (OpenClaw, OpenFang (getting there), vLLM, Ollama, llama.cpp) has matured to the point where local deployment requires no infrastructure expertise for single-machine setups.
  • Apple Silicon (M5 Max, incoming M5 Ultra) delivers memory bandwidth that makes quantized 70B+ MoE models viable on a desktop machine that fits on your desk, draws less than 300W, and costs under $5,000.

This is the baseline.

Everything I forecast below is incremental from here.


3–6 Month Probability-Weighted Forecast

High Probability (>80%): The Sub-$5,000 Local Frontier Workstation Becomes Standard

  • A 4-bit quantized Kimi K2.6 or its successor — running on an M5 Ultra with 192GB unified memory — at 40–60 tokens per second becomes the default local coding and research assistant for serious individual developers and small teams.
  • The Mac Studio M5 Ultra ships (most likely at WWDC June 2026 or in fall), and within 30 days of release the Ollama registry has compatible quantized weights available.
  • For developers who act on this, the API bill goes to near-zero.

Medium Probability (50–70%): Sub-20B Active Parameter MoE Models Match Current Claude Sonnet

  • The MiniMax M2.7 trajectory — 10B active parameters, strong benchmark performance, self-evolving agent capabilities — points toward a class of models where the “activated parameter budget” for Claude-Sonnet-equivalent performance drops below 15B.
  • At that level, inference is viable on an M3 MacBook Pro with 36GB unified memory.
  • The local AI workstation goes from “Mac Studio required” to “your existing laptop if it has enough RAM.”
  • The timeline for this is 3–6 months based on the current rate of open-weight model releases.
  • I find this the most exciting forecast of all three.
  • Because It. Will. Democratize. Generative. AI. And. Take. The. Power. To. The. Average. AI. Developer.
  • Woo-hoo!

Speculative But Plausible (<40%): RLVR-Trained Open-Weight Model Achieves Claude Opus-Level General Reasoning

  • The benchmark gap on general reasoning — not just coding — remains meaningful.
  • Claude Opus 4.6 and 4.7 retain advantages in nuanced writing, complex multi-domain reasoning, and tasks requiring sustained judgment across long contexts.
  • Closing this gap requires not just better base models but better RLVR training with broader reward signal coverage.
  • The academic infrastructure for this is being built rapidly.
  • Whether a 3–6 month timeline is achievable for an open-weight model to genuinely rival Claude Opus on general tasks is uncertain.
  • But it is no longer technically implausible.

What Does Not Change: The Importance of the Human in the Loop

One thing I am confident will not change in 6 months: the outputs of local LLMs — like all LLMs — require informed human evaluation to be useful in production.

The value of running local AI is not that it eliminates judgment; it is that it dramatically accelerates the drafting, research, and iteration cycles that precede judgment.

The developers, consultants, and knowledge workers who will benefit most are those who develop strong mental models for when to trust LLM output, when to verify it, and when to override it entirely.

This is, ultimately, a skill — and like all skills, it compounds.

The practitioners building this muscle today will be operating at a different level in 18 months than those who waited.


My Personal Call to Action

You do not need to wait for M5 Ultra to start.

If you have an M-series Mac with 24GB or more of unified memory, you can run a capable quantized model via Ollama today.

The setup is an afternoon’s work.

OpenClaw can be running against a local model by EOD.

Pick one workflow from Section 4’s use case list.

The one that currently costs you the most time or money.

Set it up. Run it for two weeks.

Measure the output quality against your current baseline.

If the output is not good enough, you have learned exactly what to tune.

If it is good enough — and for most professional workflows, it will be — you have just eliminated a recurring cost and gained a system that works for you without supervision, at zero marginal token cost, with all your data staying on your machine.

The infrastructure for a different relationship with AI is already here.

The only variable is whether you choose to build it.

There has never been a better time to be an AI Consultant!

Cheers!

References

  1. Artificial Analysis — Kimi K2.6 Intelligence & Performance Analysis: https://artificialanalysis.ai/models/kimi-k2-6
  2. Artificial Analysis — Kimi K2.6: The New Leading Open-Weights Model: https://artificialanalysis.ai/articles/kimi-k2-6-the-new-leading-open-weights-model
  3. Atlas Cloud Blog — Kimi K2.6 vs GLM-5.1 vs Qwen 3.6 Plus vs MiniMax M2.7 Coding 2026: https://www.atlascloud.ai/blog/guides/kimi-k2-6-vs-glm-5-1-vs-qwen-3-6-plus-vs-minimax-m2-7-coding-2026
  4. AIMadeTools — GLM-5.1 vs Kimi K2.6 Comparison: https://www.aimadetools.com/blog/glm-5-1-vs-kimi-k2-6/
  5. TokenMix — Best Chinese AI Models 2026 Comparison Guide: https://tokenmix.ai/blog/best-chinese-ai-models-2026-comparison-guide
  6. iternal.ai — LLM Benchmarks 2026: 30+ Models Ranked: https://iternal.ai/llm-selection-guide
  7. AkitaOnRails — LLM Coding Benchmark April 2026: https://akitaonrails.com/en/2026/04/24/llm-benchmarks-parte-3-deepseek-kimi-mimo/
  8. llm-stats.com — Kimi K2.6 Pricing, Benchmarks & Performance: https://llm-stats.com/models/kimi-k2-6
  9. Apple Newsroom — MacBook Pro with M5 Pro and M5 Max (March 2026): https://www.apple.com/newsroom/2026/03/apple-introduces-macbook-pro-with-all-new-m5-pro-and-m5-max/
  10. Macworld — Mac Studio 2026: M5 Max & Ultra Release Date, Price, Specs: https://www.macworld.com/article/2973459/2026-mac-studio-m5-release-date-specs-price-rumors.html
  11. MacRumors — M5 Ultra Chip Coming to Mac Studio in 2026: https://www.macrumors.com/2025/11/04/mac-studio-m5-ultra-2026/
  12. MacRumors — Mac Studio Rumor Recap April 2026: https://www.macrumors.com/2026/04/17/mac-studio-rumor-recap-april/
  13. TechRepublic — Mac Studio 2026 M5 Max Ultra Release Date: https://www.techrepublic.com/article/news-apple-mac-studio-m5-max-ultra-2026-release-date/
  14. TechRepublic — Mac Studio 2026 M5 Price & Release Timeline: https://www.techrepublic.com/article/news-mac-studio-2026-m5-price-release-timeline/
  15. ZEERA Wireless — Apple M5 Mac Studio 2026 Rumors: https://zeerawireless.com/blogs/news/apple-m5-mac-studio-2026-rumors-june-release-date-m5-ultra-256gb-ram-limit
  16. Wikipedia — OpenClaw: https://en.wikipedia.org/wiki/OpenClaw
  17. KDnuggets — OpenClaw Explained: https://www.kdnuggets.com/openclaw-explained-the-free-ai-agent-tool-going-viral-already-in-2026
  18. Clawbot.blog — OpenClaw: The Rise of an Open-Source AI Agent Framework (April 2026): https://www.clawbot.blog/blog/openclaw-the-rise-of-an-open-source-ai-agent-framework-april-2026-update/
  19. DigitalOcean — What Is OpenClaw?: https://www.digitalocean.com/resources/articles/what-is-openclaw
  20. AlphaTechFinance — OpenClaw Complete 2026 Guide: https://alphatechfinance.com/productivity-app/openclaw-ai-agent-2026-guide/
  21. OpenFang GitHub Repository: https://github.com/RightNow-AI/openfang
  22. OpenClaw GitHub Repository: https://github.com/openclaw/openclaw
  23. Till Freitag — What Is OpenClaw? (EN): https://till-freitag.com/en/blog/what-is-openclaw-en
  24. Lushbinary — OpenClaw + Gemma 4 Setup Guide 2026: https://lushbinary.com/blog/openclaw-gemma-4-local-ai-agent-ollama-setup-guide-2026/
  25. NVIDIA NemoClaw: https://www.nvidia.com/en-us/ai/nemoclaw/
  26. Andrej Karpathy — 2025 LLM Year in Review: https://karpathy.bearblog.dev/year-in-review-2025/
  27. MLOps Substack — 2025 LLM Year in Review from Andrej Karpathy: https://mlops.substack.com/p/2025-llm-year-in-review-from-andrej
  28. Silicon Republic — Andrej Karpathy Unveils Eureka Labs: https://www.siliconrepublic.com/machines/andrej-karpathy-eureka-labs-ai-startup-education-platform-llm101n
  29. Karpathy.ai — Neural Networks: Zero to Hero: https://karpathy.ai/zero-to-hero.html
  30. IBM Think — What Is Quantization-Aware Training (QAT)?: https://www.ibm.com/think/topics/quantization-aware-training
  31. arXiv — ZeroQAT: End-to-End On-Device QAT for LLMs at Inference Cost: https://arxiv.org/html/2509.00031v2

:::info All Images are AI-Generated by Nano Banana 2.

:::

:::info Claude Sonnet 4.6 was used in the first draft of this article.

:::
