Exponential View

By Azeem Azhar, an expert on artificial intelligence and exponential technologies.

🟣 EV Daily: US doubles down on data and energy

2025-07-16 17:53:18

🏭 Lead story: US doubles down on data and energy

Donald Trump and Senator Dave McCormick unveiled a $70 billion-plus package of private commitments to build and power a crop of new data centres in Pennsylvania and Ohio. Google will pour $25 billion into new data-centre capacity across the Pennsylvania-New Jersey-Maryland (PJM) grid, underwritten by a $3 billion, 670MW hydropower deal with Brookfield, an asset manager, that can scale to 3GW. Blackstone matched the bet with a $25 billion plan to co-locate data centres and gas generation, while CoreWeave pledged $6 billion for an AI-specific campus outside Lancaster. This is industrial policy by other means. Cheap, controllable power—rather than clever code—has become the decisive input for frontier models, and partisan politics is rushing to supply it. By bundling electrons, real estate and job guarantees into a single narrative, Republicans are positioning energy sovereignty as the new logic board of national AI advantage. [Semafor]


Key signals, quick scan

A 30-second scan of four secondary signals that hint at where the curve is bending.

  • 🌏After months of uncertainty, US regulators have again cleared Nvidia’s downgraded H20 GPU for export, reopening a pipeline worth billions. The flip-flop shows policy is fluid—and that hardware downgrades remain a viable workaround to access China’s vast AI spend. [FT]

  • 🧪An AI-powered laboratory can run itself and discover materials 10 times faster than the best human researchers. The dynamic-flow technique lets autonomous materials-discovery rigs collect a data point every 0.5 seconds instead of one per experiment—unlocking just-in-time discovery for battery, catalyst and semiconductor R&D. [ScienceDirect]

  • 👩‍💻OpenAI is developing features that let ChatGPT quickly create and edit presentations and spreadsheets directly compatible with PowerPoint and Excel. Microsoft’s old friend is now a direct rival. [The Information]

  • 🪖DJI ships millions of drones a year—outpacing the entire US industry (a 500-company-strong sector) by more than 20x. It gives Beijing a scale and cost advantage in both civilian and dual-use drone tech just as unmanned systems enter mainstream logistics and warfare. [NYT]


Future of work: Optimising your AI collaboration

When should an AI decide on its own, and when should a human see its confidence score? MIT economists put 3,500 volunteers through a fact-checking exercise to find out.

They modeled V(x)—the proportion of correct human decisions when shown an AI confidence score of x%—and used it to evaluate five hand-off policies. The winner was a two-tier rule:

  1. Auto-accept AI answers above a high-confidence threshold (say 90%)

  2. Show the exact percentage for everything else and let the human decide.

This nudged accuracy from 78% to 80.5%, a 2.5-point gain over full human review with total AI transparency. The worst approach—hiding the AI score entirely—lagged by seven points.

The winning policy tackles a key challenge: humans often fail to trust even highly confident AI predictions, leading to suboptimal results. By estimating your own V(x) curve for specific tasks, you can set smarter thresholds: automate ultra-confident outputs and present exact scores where human judgment adds the most value.
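Estimating your own V(x) and applying the two-tier rule takes only a few lines. Here is a minimal Python sketch; `handoff_decision` and `estimate_v_curve` are illustrative names rather than anything from the MIT study, and simple binning is just one of several ways to estimate V(x) from logged decisions.

```python
import numpy as np

def handoff_decision(ai_answer, confidence, human_review_fn, threshold=0.90):
    """Two-tier hand-off: auto-accept above the threshold,
    otherwise show the human the exact confidence score."""
    if confidence >= threshold:
        return ai_answer                           # tier 1: auto-accept
    return human_review_fn(ai_answer, confidence)  # tier 2: human decides

def estimate_v_curve(confidences, human_was_correct, bins=10):
    """Estimate V(x) from logged decisions: bucket past cases by the
    confidence score shown, then measure the share the human got right."""
    edges = np.linspace(0.0, 1.0, bins + 1)
    idx = np.clip(np.digitize(confidences, edges) - 1, 0, bins - 1)
    v = np.full(bins, np.nan)
    for b in range(bins):
        mask = idx == b
        if mask.any():
            v[b] = np.mean(np.asarray(human_was_correct)[mask])
    return edges, v
```

With a V(x) curve in hand, you would place the auto-accept threshold where the AI's accuracy reliably exceeds what humans achieve when shown the score.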


🔮 Kimi K2 is the model that should worry Silicon Valley

2025-07-16 03:57:18

In October 1957, Sputnik 1 proved that the USSR could breach Earth’s gravity well and shattered Western assumptions of technological primacy. Four years later, Vostok 1 carried Yuri Gagarin on a single loop around the Earth, confirming that Sputnik was no fluke and that Moscow’s program was accelerating.

In today’s AI, DeepSeek plays the Sputnik role – as we called it in December 2024 – as an unexpectedly capable Chinese open‑source model that demonstrated a serious technical breakthrough.

Now AI has its Vostok 1 moment. Chinese startup Moonshot’s Kimi K2 model is cheap, high-performing and open-source. For American AI companies, the frontier is no longer theirs alone.

Vostok 1 launch. Source: Wikipedia

In today’s analysis, we’ll get you up to speed on Kimi K2, including:

  1. What Kimi K2 is and how it works – its architecture, optimizer and training process, and how it was developed inexpensively and reliably on export-controlled chips.

  2. Why Kimi K2 matters strategically – how it shifts the centre of AI gravity, particularly on efficiency, and why it’s a wake-up call for US incumbents.

  3. What comes next – the implications for open-source versus closed-source, AGI strategy, and China’s growing AI advantage.


What’s so special about Kimi K2?

First off, it’s not a Kardashian 😆. But it is engineered for mass attention. Only here, the mechanism is literal. Like DeepSeek, Kimi K2 uses a mixture-of-experts (MoE) architecture, a technique that lets it be both powerful and efficient. Instead of processing every input with the entire model (which is slow and costly), MoE allows the model to route each task to a small group of specialized “experts.” Think of it like calling in just the right specialists for a job, rather than using a full team every time.

K2 packs one trillion parameters, the largest for an open-source model to date. It routes most of that capacity through 384 experts, of which eight – roughly 32 billion parameters – activate for each query. Each expert hones a niche. This setup speeds the initial pass over text while regaining depth through selective expert activation to deliver top‑tier performance at a fraction of the compute cost.
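To make the routing concrete, here is a toy numerical sketch of top-k expert selection. The expert count and top-k match the figures reported for K2, but everything else (dimensions, random weights, softmax gating) is an illustrative stand-in, not the real model:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy mixture-of-experts layer: 384 experts, top-8 routing, mirroring
# K2's reported shape. Dimensions and weights are illustrative only.
N_EXPERTS, TOP_K, D = 384, 8, 64

gate_w = rng.standard_normal((D, N_EXPERTS)) / np.sqrt(D)
experts = [rng.standard_normal((D, D)) / np.sqrt(D) for _ in range(N_EXPERTS)]

def moe_forward(x):
    """Route a token vector x to its top-k experts. Only k of the 384
    experts run, which is where the compute savings come from."""
    scores = x @ gate_w                    # gating score per expert
    top = np.argsort(scores)[-TOP_K:]      # pick the k best-scoring experts
    weights = np.exp(scores[top])
    weights /= weights.sum()               # softmax over the chosen k
    return sum(w * (x @ experts[i]) for w, i in zip(weights, top))

y = moe_forward(rng.standard_normal(D))
```

In this toy setup only 8 of 384 experts fire per token, roughly 2% of the expert parameters, which is the same sparsity logic behind K2's ~32 billion active parameters out of a trillion.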

But Kimi K2 didn’t start from scratch. It built directly on DeepSeek’s open architecture.

One of the most beautiful curves in ML history

It’s a textbook case of the open innovation feedback loop, where each release seeds the next and shared designs accelerate the whole field. That loop let Kimi K2 focus on the next innovation: its approach to training.

Training a large language model is like adjusting millions of tiny knobs – each one a parameter that nudges the model toward fewer mistakes. The optimizer decides how large each adjustment should be. The industry standard, AdamW, updates each parameter based on recent trends and gently nudges it back toward zero. But at massive scale, this can go haywire. Loss spikes – sudden jumps in error – can derail training and force costly restarts.
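For concreteness, a single AdamW update can be sketched as below. The moment estimates capture the "recent trends" and the decoupled weight-decay term is the gentle nudge toward zero; hyperparameter values are the common defaults, not Moonshot's settings.

```python
import numpy as np

def adamw_step(p, g, m, v, t, lr=1e-3, b1=0.9, b2=0.999, eps=1e-8, wd=0.01):
    """One AdamW update for parameter p with gradient g.
    m and v are running moment estimates; t is the step count (from 1)."""
    m = b1 * m + (1 - b1) * g           # first moment: recent gradient trend
    v = b2 * v + (1 - b2) * g * g       # second moment: recent gradient scale
    m_hat = m / (1 - b1 ** t)           # bias correction for early steps
    v_hat = v / (1 - b2 ** t)
    # Normalized step plus decoupled weight decay (the nudge toward zero).
    p = p - lr * (m_hat / (np.sqrt(v_hat) + eps) + wd * p)
    return p, m, v
```

Because the step is normalized by the second moment, a sudden gradient blow-up can translate into a wild parameter jump, which is exactly the instability the loss-spike problem describes.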

Moonshot’s MuonClip optimizer introduces two innovations to improve the training and stability of AI systems.

First, it adds “second-order” insight, meaning it doesn’t just look at how the model is learning (via gradients), but also how those gradients themselves are changing. This helps the model make sharper, more stable updates during training to improve both speed and reliability.

Second, it adds a safety mechanism called QK-clipping to the attention mechanism. Normally, when the model calculates how words relate to each other (by multiplying ‘query’ and ‘key’ weights), those values can sometimes become too large and destabilize the system. QK-clipping caps those scores before they spiral out of control, acting like a circuit-breaker to keep the model focused and stable.
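A simplified sketch of the circuit-breaker idea follows. Note this is an illustration only: the actual QK-clipping mechanism reportedly rescales the query and key projection weights when attention scores grow too large, whereas the sketch simply caps the computed logits, which conveys the same intuition with less machinery.

```python
import numpy as np

def attention_logits_with_clip(q, k, clip=50.0):
    """Scaled query-key attention scores with a hard cap.
    Capping stops any single logit from saturating the softmax,
    acting as the circuit-breaker described above (illustrative value)."""
    logits = (q @ k.T) / np.sqrt(q.shape[-1])
    return np.clip(logits, -clip, clip)
```

Without the cap, a logit of a few hundred would push the softmax to a one-hot distribution and its gradients toward zero or infinity, destabilizing training.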

The result is “[o]ne of the most beautiful loss curves in ML history,” as AI researcher Cedric Chee put it. Training runs longer and more reliably, at a scale unprecedented for open source.

Loss curve for Kimi K2 during pretraining on 15.5 trillion tokens using the MuonClip optimizer. The smooth, downward trajectory—with no instability spikes or plateaus—demonstrates MuonClip’s ability to maintain stable, large-scale LLM training across trillions of tokens.

This likely unlocked massive compute savings: research from earlier this year estimates that Muon optimizers are roughly twice as computationally efficient as AdamW. That efficiency matters under export controls, since Moonshot likely had to train K2 on compliant A800 and H800 hardware instead of the flagship H100s. The training ran on more than 15.5 trillion tokens, roughly 50x GPT‑3’s intake, without a single loss spike, catastrophic crash or reset. Given this, training was probably relatively inexpensive, likely in the low tens of millions of dollars.

Beyond its architecture and optimizer, Kimi K2 was trained with agentic capabilities in mind. Moonshot built simulated domains filled with real and imaginary tools, then let competing agents solve tasks within them. An LLM judge scored the outcomes, retaining only the most effective examples. This taught K2 when to act, pause or delegate. Even without a chain-of-thought layer, where the model generates intermediate reasoning steps before answering, the public Kimi‑K2‑Instruct checkpoint performs impressively on tool-use, agentic and STEM-focused benchmarks, matching or exceeding GPT-4.1 and Claude 4 Sonnet. In a quite different register, it also ranks as the best short‑story writer.

Kimi K2 performance on a set of benchmarks. Source

Artificial Analysis notes that Kimi K2 is noticeably more verbose than other non-reasoning models like GPT‑4o and GPT‑4.1. In their classification it sits between reasoning and non‑reasoning models. Its token usage is up to 30% lower than Claude 4 Sonnet and Opus in maximum‑budget extended‑thinking modes but nearly triple that of both models when reasoning is disabled.

Still, it currently doesn’t leverage chain-of-thought reasoning. Moonshot will likely release a model which adds this in the future. If it mirrors DeepSeek’s leap from V3 to R1, it could close the gap with closed-source giants on multi-step reasoning and potentially become the best overall model. But that’s not guaranteed.


Even with its verbosity, pricing remains one of Kimi K2’s key strengths. Moonshot has taken DeepSeek’s foundation and improved it across the board, pushing out the price‑performance frontier. The public API lists rates at $0.15 per million input tokens and $2.50 per million output tokens. This makes it 30% cheaper than Gemini 2.5 Flash on outputs, and more than an order of magnitude cheaper than Claude 4 Opus ($15 in / $75 out), GPT‑4o ($2.5 in / $10 out), or GPT‑4.1 ($2 in / $8 out).
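At those listed rates, a back-of-envelope comparison is straightforward. The workload below (100M input and 20M output tokens) is hypothetical, and real bills also depend on verbosity, caching and context length:

```python
# API rates in USD per million tokens (input, output), as listed in the text.
RATES = {
    "Kimi K2":       (0.15, 2.50),
    "Claude 4 Opus": (15.00, 75.00),
    "GPT-4o":        (2.50, 10.00),
    "GPT-4.1":       (2.00, 8.00),
}

def workload_cost(model, input_m_tokens, output_m_tokens):
    """Cost of a workload measured in millions of input/output tokens."""
    cin, cout = RATES[model]
    return cin * input_m_tokens + cout * output_m_tokens

# Hypothetical monthly workload: 100M input + 20M output tokens.
for name in RATES:
    print(f"{name}: ${workload_cost(name, 100, 20):,.2f}")
```

At this hypothetical volume Kimi K2 comes out at $65 versus $3,000 for Claude 4 Opus, which is where the order-of-magnitude claim comes from, before accounting for K2's higher token output.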

However, in practice, K2’s higher token output makes it more expensive to run than other open-weight models like DeepSeek V3 and Qwen3, even though it significantly outperforms them.

Still, it sits right on the edge of the cost-performance frontier, as it delivers near-frontier capability on agentic and coding tasks at unit economics that make sense for everyday product workloads. And those economics improve further if you run it on your own hardware.

As I said in my introduction, Kimi K2 is China’s Vostok 1 moment: it is proof that China can not only match but push forward the state of the art under real-world constraints. And like Vostok, what matters most isn’t just the launch – but the chain reaction it sets off.

Why Kimi K2 matters

Within weeks of Gagarin’s Vostok flight, the US scrambled to close the gap. Alan Shepard’s 15‑minute Freedom 7 hop on 5 May 1961 put the first American in space. Just twenty days later, President John F. Kennedy asked Congress to commit the nation to landing a man on the Moon before 1970.

K2 has now confirmed that China leads in AI‑efficiency innovations. DeepSeek R1 proved that you could graft full chain‑of‑thought reasoning onto a sparse mixture‑of‑experts model simply by inventing a new reinforcement objective – Group Relative Policy Optimization (GRPO) – and reducing the reliance on expensive human‑written dialogues.

Last week, Kimi K2 repeated the trick on the training side: the MuonClip optimizer keeps the gradient so well behaved that a trillion‑parameter MoE can process 15.5 trillion tokens without a single loss spike while using about half the FLOPs of AdamW. Two genuine algorithmic advances, six months apart, both published under permissive licenses, have shifted the centre of gravity for efficiency innovation to Beijing rather than Palo Alto.

The next question is whether that shift actually matters.

Read more

🟣 EV Daily: The Pentagon goes all-in on AI

2025-07-15 17:29:02

Lead story: The Pentagon goes all-in on AI

The Pentagon’s Chief Digital & AI Office has written three identical $200 million checks—one each to Anthropic, Google, and Elon Musk’s xAI—to push agentic AI from lab to battlefield, only weeks after giving OpenAI the same mandate. This is the state seeding a domestic oligopoly of frontier-model suppliers. It’s not a winner-takes-all bet, but a portfolio hedge that locks talent and compute inside US security perimeters while forcing the giants to prove safety, auditability and interoperability on DoD terms. In the exponential age, power accrues to whoever sets the technical and ethical specs for these agents; America is racing to codify those specs before rivals do. [BreakingDefense] #AIModels


Key signals, quick scan

A 30-second scan of four secondary signals that hint at where the curve is bending.

  • Meta will switch on its Prometheus (Ohio) data center next year and break ground on Hyperion (Louisiana) soon after. Both will provide several gigawatts of load, rivalling small nuclear plants and confirming that hyperscalers are leapfrogging 100-MW builds in favour of giga-campuses to feed next-gen “super-intelligence” models. And they’re vast—Prometheus will be almost as big as Manhattan. [The Information] #Infrastructure

  • Xiaomi is taking on Tesla in China with a highly automated EV factory employing 1,000 robots. The company’s manufacturing approach, which includes replacing a 72-part process that has 840 weld points with a single die-cast element, allows for rapid production and means Xiaomi will soon challenge Tesla’s competitive pricing. [KrAsia] #Hardware #Robotics

  • China has approved Synopsys’s $35 billion acquisition of Ansys — the biggest deal ever in electronic design automation, the software used to design advanced computer chips. With this approval, two US companies will control nearly the entire global market for high-end chip design software. A supply chain already strained by export controls now hinges on even fewer tool providers. [Reuters] #Infrastructure

  • Meta’s new superintelligence lab has reportedly considered abandoning its most powerful open-source model, Behemoth, in favor of a closed alternative. If it does, it all but confirms that China now leads the open-weight frontier. [NYT] #Models


Tool of the week – Building an AI assistant

Last week, we built a no-code automation that helps you triage inbox chaos, handle urgent requests and shield your focus time – without hiring a virtual assistant.

The assistant works like this:

  • It reviews every incoming email and classifies it by urgency and importance.

  • It pings you on Slack if something genuinely urgent appears.

  • You can reply right there—or ask it to block out a 30-minute focus slot in your calendar.

We used Lindy, a tool designed for chaining AI steps together (alternatives include n8n, Make, or Wordware). But the key takeaway isn’t the tool – it’s the workflow: lightweight and built to defend your time.
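For readers who prefer code to no-code, the same three-step workflow can be sketched in plain Python. `classify_email`, `notify_slack` and `block_calendar` are hypothetical stand-ins for whatever LLM, Slack and calendar integrations you wire in; the point is the shape of the flow, not the tooling.

```python
def triage(email, classify_email, notify_slack, block_calendar):
    """One pass of the assistant: classify, escalate if urgent,
    and optionally protect a 30-minute focus slot."""
    label = classify_email(email)          # e.g. "urgent", "important", "later"
    if label == "urgent":
        reply = notify_slack(email)        # ping the user in Slack
        if reply == "block-focus-time":
            block_calendar(minutes=30)     # defend a 30-minute slot
    return label
```

The same structure drops into n8n, Make or Wordware: one classification step, one conditional notification, one calendar action.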

🔧 Here’s the tutorial to build your own in under 10 minutes.

All annual members of Exponential View get extended free access to Lindy – claim your AI bundle here.



Was this email useful?

Thanks! Want to share this with someone like you? Forward the email.


Want to opt out of the daily emails? Follow these steps. You can still receive essays and the Sunday edition as usual.

Exponential View is a reader-supported publication. To receive new posts and support my work, consider becoming a free or paid subscriber.

🟣 EV Daily: Two Moonshots — one hit, one miss

2025-07-14 18:24:21

Lead story: Two Moonshots — one hit, one miss

Two headlines over the weekend capture the AI industry’s divergent tempos. In Beijing, Moonshot AI open-sourced Kimi K2, a trillion-parameter mixture-of-experts model that tops GPT-4.1 on coding and math tests and is free to download. In San Francisco, OpenAI hit the brakes yet again, postponing its own open-weight release “indefinitely” for extra safety checks. The difference between approaches is stark. The insurgent treats cutting-edge capability as a public good, accelerating adoption and learning curves; the incumbent, wary of reputational and regulatory blow-back, keeps its crown jewels under wraps a little longer. The releases of DeepSeek R1 and Kimi K2 show how China’s open-source push is eroding the idea of durable moats. The only thing likely to slow that momentum now is the tightening grip of chip controls. [VentureBeat, TechCrunch] #AIModels


Key signals, quick scan

A 30-second scan of four secondary signals that hint at where the curve is bending.

  • Despite US export controls, Beijing has acquired around 115,000 top-tier Nvidia GPUs (such as H100s) to power dozens of data centers in its new AI “desert hub.” [DigiTimes] #Geopolitics #Hardware

  • Goldman Sachs is deploying hundreds of instances of Devin, a self-directing AI engineer capable of reading tickets, fixing bugs and committing code without human keystrokes. If successful, the model could reduce the need for junior developers (Goldman has 12,000 devs) and compress multi-sprint cycles into hours. [CNBC] #Economics

  • Google’s $2.4bn cash-plus-acquihire of Windsurf, weeks after Meta’s recruitment blitz, shows Big Tech will now write multi-billion-dollar checks simply to lock down scarce LLM talent, usage data and codebases. [TechCrunch] #SoftwareApps

  • Two-thirds of UK children aged 9-17 use AI chatbots regularly, and 35% of that group say their interactions feel like talking to a friend. “It’s not a game to me,” one 13-year-old boy told nonprofit Internet Matters, which conducted the survey of 1,000 people, “because sometimes they can feel like a real person and a friend.” [Futurism] #Society


Data of the week

Five data points that show how the world is changing.

  • $750 billion is going into new and in-progress data centers, driven by hyperscalers, AI-native clouds, and sovereign funds betting on infrastructure. [X/ShanuMatthews] #Hardware

  • Gemini traffic jumped 148% between December 2024 and June 2025, triple ChatGPT’s 46%, though ChatGPT still holds 78.6% of the market compared with Google’s 8.6%. [Similarweb] #Economics

  • Anthropic’s annualized revenue has quadrupled to $4 billion since the start of the year, nearly half of OpenAI’s. [X/Deedydas] #Economics

  • Nvidia’s 2025 GPU output is expected to fall by 10% (500,000 units) because of ongoing production bottlenecks. [X/Rwan07] #Hardware

  • Paid AI adoption among US firms fell 0.5% this month, the first decline since the Ramp AI Index began tracking.1 [RampAI] #Economics




1

The Ramp AI Index tracks paid adoption of AI products and services across more than 40,000 US businesses, using anonymised corporate card and bill pay data.

🔮 Sunday edition #532: The speed trap in AI, Grok’s ecosystem play, and truth by citation++

2025-07-13 11:36:18

Hi, it’s Azeem,

This week, I’m thinking about structural advantages, the hidden forces that determine who wins when technology shifts.

In climate, a paradox: could the anti-green agenda actually accelerate innovation? In AI, we explore why speed, not just smarts, could shape the next phase and how xAI may be quietly building an unmatchable platform advantage. Plus, a parable from the age of horses that every CEO should revisit.



When going backward drives progress

Over the past decade, the big lie of the energy transition was that environmental virtue would drive change. In reality, power – and thus the ability to create change – flows to whoever can produce energy and critical materials at the lowest global price. The Trump administration may have sidelined environmental ideals, but it can’t undo that underlying economic fact.

VC investor and EV member Vinod Khosla suggests something counterintuitive: that America’s rollback of green subsidies might unlock the next climate breakthrough. If mature sectors like wind and solar no longer need state support, that money could be freed up for riskier bets: fusion, super-hot geothermal, or low-carbon steel. Once a technology becomes economically viable, continued subsidies risk distorting the market and propping up winners that no longer need support.

The bet only works, of course, if funds don’t end up propping up coal. But the fundamentals have shifted. In most places, clean energy is already cheaper than fossil fuels. If the US wants to outcompete China, it has no choice but to lean in.


Does AI actually slow you down?

AI was supposed to save time. In software development, it might be doing the opposite.

A new METR study found that experienced open-source developers using early-2025 AI tools (Cursor Pro with Claude 3.5 and 3.7) actually took 19% longer to complete tasks. Developers had expected a 24% speedup. Experts had predicted even more. In reality, AI added overhead when used in complex projects.

The main drag on performance came from the time spent prompting, waiting, and verifying outputs, often compounded by over-reliance on flawed suggestions.

Anecdotally, I’ve seen the same issue elsewhere.

Read more

🔮 Task first, pay later: AI’s twin routes for wages and work

2025-07-12 13:22:55

An earlier version of this post incorrectly stated that Recruit Holdings cut 1,300 HR and recruiting roles. In fact, the layoffs affected roles across a range of departments and functions. The edition below has been updated to reflect this more accurately. Thanks to EV member Tom W. for flagging the error.


A Mad Max economy, where your hard-earned expertise trades at commodity prices, is a plausible future of work. One type of AI automation deployment could move millions into low-paid service roles even as employment survives on paper.

However, that same technology could also have the opposite effect: it could turn scarce, high-paying jobs into mass-market opportunities, and even create jobs that don’t yet exist.

MIT economists David Autor and Neil Thompson explore what is at stake in their landmark study. After analyzing four decades of data across 303 US occupations, they found that AI’s wage effects hinge on one crucial factor: what gets automated. Not whether firms adopt AI, but which tasks they hand over to machines.

Let’s look into this further.

A Mad Max economy is a choice, not our fate. It’s a good movie, though.

Two pathways

When firms automate the complex bits of a role—say, the knowledge or judgment-heavy tasks—wages tend to fall, while employment increases. Autor and Thompson's data shows that a one-standard-deviation drop in task expertise over a decade is linked to an approximate 18% wage decline while employment roughly doubles.

Telephone operators from 1980-2018 are the canonical case, where technical simplification made it easier for more people to enter, but at lower pay.

And Uber is a modern parallel. By breaking the stranglehold that medallion owners once held over New York’s taxi market, the mix of GPS routing and app-based matching opened ride-hailing to a far broader pool of drivers and passengers. Between 2014 and 2024 the city’s total ride market, measured by fares, expanded 228%, while the number of active drivers nearly doubled from 44,000 to 95,000. This greater supply cut average earnings – after allowing for roughly 32% inflation, a typical yellow-cab driver earned 10-15% less in real terms. Notably, about two-thirds of former taxi drivers left the sector altogether, with some transitioning to drive for Uber and others pursuing alternative employment.

The other path looks very different.

Several major firms are now embracing routine-task automation, targeting predictable and codified tasks like call handling, screening and scheduling. The outcome flips: employment shrinks, but wages rise for those who remain. BT plans to cut up to 55,000 jobs – about 40% of its workforce – by 2030, as fibre roll-out and generative AI take over routine customer service work. Recruit Holdings, which owns Indeed and Glassdoor, just cut 1,300 roles after embedding large-language models across its platforms. At Amazon, where robots already outnumber human staff, CEO Andy Jassy admits that generative “agents” will eliminate some job families even as they create new technical ones. In each case, automation strips out low-complexity work. Headcount contracts, the skills bar rises and the wage ladder steepens. This may already be the case for AI coding roles, with sky-high salaries for super-coders widening the gap between them and the average coder.

Yet this outcome isn’t inevitable.

Read more