2026-04-09 21:00:01

By many estimates, quantum computers will need millions of qubits to realize their potential in cybersecurity, drug development, and other industries. The trouble is, anyone who wants to simultaneously control millions of a certain kind of qubit runs into a second challenge: controlling millions of laser beams.
That’s exactly the challenge scientists from MIT, the University of Colorado at Boulder, Sandia National Laboratories, and the MITRE Corporation were trying to overcome when they developed an image-projection technology that, they realized, could also fix a host of other challenges in augmented reality, biomedical imaging, and elsewhere. It comes in the form of a photonic chip, less than 0.1 square millimeters in area, capable of projecting the Mona Lisa onto an area smaller than two human egg cells.
“When we started, we certainly never would have anticipated that we would be making a technology that might revolutionize imaging,” says Matt Eichenfield, one of the leaders of the diamond-based quantum computer effort, called Quantum Moonshot, and a professor of quantum engineering at the University of Colorado at Boulder. Their chip is capable of projecting 68.6 million individual spots of light—called scannable pixels to differentiate them from physical pixels—per second per square millimeter, more than fifty times the capability of previous technology, such as micro-electromechanical systems (MEMS) micromirror arrays.
“We have now made a scannable pixel that is at the absolute limit of what diffraction allows,” says Henry Wen, a visiting researcher at MIT and a photonics engineer at QuEra Computing.
The chip’s distinguishing feature is an array of tiny metallic cantilevers, which curve away from the plane of the chip in response to voltage and act as miniature “ski-jumps” for light. Light is channeled along the length of each cantilever via a waveguide, and exits at its tip. The cantilevers contain a thin layer of aluminum nitride, a piezoelectric which expands or contracts under voltage, thus moving the micromachine up and down and enabling the array to scan beams of light over a two-dimensional area.
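As a toy illustration of the actuation principle, the scanning can be modeled as a tip angle that varies with drive voltage, steering the exit beam across a projection plane. Everything here is an assumption made for illustration: the constants are invented, the two orthogonal drive voltages are hypothetical, and the real device's voltage response is far more complex than a linear model.

```python
import math

# Toy model of piezoelectric beam steering (invented constants): assume the
# cantilever tip angle on each of two axes is linear in drive voltage.
DEG_PER_VOLT = 9.0   # hypothetical actuation sensitivity
THROW_UM = 50.0      # hypothetical projection distance, in micrometers

def spot_position(vx, vy):
    """Map two drive voltages to a projected spot position (micrometers)."""
    ax = math.radians(vx * DEG_PER_VOLT)
    ay = math.radians(vy * DEG_PER_VOLT)
    return (THROW_UM * math.tan(ax), THROW_UM * math.tan(ay))

# Raster-scan a 5-by-5 grid of scannable pixels with voltages from -2 V to 2 V.
grid = [spot_position(vx, vy) for vx in range(-2, 3) for vy in range(-2, 3)]
```

Sweeping both voltages in this way traces out a two-dimensional grid of scannable pixels, which is the behavior the article describes.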
Despite the magnitude of the team’s achievement, Eichenfield says that the process of engineering the cantilevers was “pretty smooth.” Each cantilever is composed of a stack of four thin layers of material and is curled approximately 90 degrees out of the plane at rest. To achieve such a high curvature, the team took advantage of differences in the contraction and expansion of individual layers when cooled. On top of its four layers of material, each cantilever also features a series of silicon dioxide bars running perpendicular to the waveguide, which keep the cantilever from curling along its width.
A micro-cantilever wiggles and waggles to project light in the right place. Matt Saha, Y. Henry Wen, et al.
What was more of a challenge than engineering the chip itself was figuring out the details of actually making the chip project images and videos. Working out the process of synchronizing and timing the cantilevers’ light beams to generate the right colors at the right time was a substantial effort, according to Andy Greenspon, a researcher at MITRE who also worked on the project. Now, the team has successfully projected the movie A Charlie Brown Christmas through the chip.
The chip projected a roughly 125-micrometer image of the Mona Lisa. Matt Saha, Y. Henry Wen, et al.
Because the chip can project so many more spots in any given time interval than any previous beam scanner, it could also be used to control many more qubits in quantum computers. The Quantum Moonshot program’s mission is to build a quantum computer that can be scaled to millions of qubits. So clearly, it needs a scalable way of controlling each one, explains Wen. Instead of using one laser per qubit, the team realized that not every qubit needs to be controlled at any given moment. The chip’s ability to move light beams over a two-dimensional area would allow them to control all of the qubits with many fewer lasers.
Another process that Wen thinks the chip could improve is scanning objects for 3D printing. Today, that typically involves using a single laser to scan over the entire surface of an object. The new chip, however, could potentially employ thousands of laser beams. “I think now you can take a process that would have taken hours and maybe bring it down to minutes,” says Wen.
Wen is also excited to explore the potential of different cantilever shapes. By changing the orientations of the bars perpendicular to the waveguide, the team has been able to make the cantilevers curl into helixes. Wen says that such unusual shapes could be useful in making a lab-on-a-chip for cell biology or drug development. “A lot of this stuff is imaging, scanning a laser across something, either to image it or to stimulate some response. And so we could have one of these ski jumps curl not just up, but actually curl back around, and then move around and scan over a sample,” Wen explains. “If you can imagine a structure that will be useful for you, we should try it.”
2026-04-08 02:00:02

Kyle McGinley graduated from high school in 2018 and, like many teenagers, he was unsure what career he wanted to pursue. Recuperating from a sports injury led him to consider becoming a physical therapist for athletes. But he was skilled at repairing cars and fixing things around the house, so he thought about becoming an engineer, like his father.
McGinley, who lives in Sellersville, Pa., took some classes at Montgomery County Community College in Blue Bell, while also working. During his years at the college, he took a variety of courses and was drawn to electrical engineering and computing, he says. He left to pursue a bachelor’s degree in electrical and computer engineering in Philadelphia at Temple University, where he is currently a junior.
Student member
UNIVERSITY: Temple, in Philadelphia
MAJOR: Electrical and computer engineering
The 26-year-old is also a teaching assistant and a research assistant at Temple. His research focuses on applying artificial intelligence to electrical hardware and robotics. He helped build an AI-integrated android companion to assist in-home caregivers.
Temple recognized McGinley’s efforts last year with its Butz scholarship, which is awarded annually to an electrical and computer engineering undergraduate with an interest in software development, AI development systems, health education software, or a similar field.
An IEEE student member, he is active within the university’s student branch.
“Dr. Brian Butz, the late professor emeritus, dedicated his research to artificial intelligence,” McGinley says. “The scholarship he and his wife Susan established helps allow students to pursue research in AI. Their generous donation has helped fund my research.”
McGinley is a teaching assistant for his digital circuit design course. In a class of 35 students, it can be a struggle for some to digest the professor’s words, he says.
“My job is to answer students’ questions if they are having problems following the professor’s lecture or are confused about any of the topics,” he says. “In the lab, I help students debug code or with hardware issues they have on the FPGA [field-programmable gate array] boards.”
He also conducts research for the university’s Computer Fusion Lab under the supervision of IEEE Senior Member Li Bai, a professor of electrical and computer engineering. McGinley writes software programs at the lab.
One such assignment was working with the Temple School of Social Work at the Barnett College of Public Health to build a robot companion integrated with AI to assist individuals with Parkinson’s disease and their caregivers.
“I realized the need for this with my grandmother, when she was taking care of my grandfather,” he says. “It was a lot for her, trying to remember everything.”
Using the latest software and hardware, he and three classmates rebuilt an older lab robot. They installed an operating system and used Python and C++ for its control, perception, and behavior, he says. The students also incorporated Google’s Gemini AI to help with routine tasks such as scheduling medication reminders and setting alarms for upcoming doctor visits.
Kyle McGinley helped build an AI-integrated android to assist individuals with Parkinson’s disease and their caregivers. Temple University of Public Health
The AI-integrated android was intended to assist, not replace, the caregivers by handling the mental load of remembering tasks, he says.
“This was one of the cool things that drew me to working in the robotics field,” he says. “Something where AI could be used to help caregivers do simple tasks.
“My career ambition after I graduate is to gain real-world experience in the engineering industry to learn skills outside of academia. Long term, I want to do project management or work in a technical lead role, with the primary goal of creating impactful projects that I can be proud of.”
McGinley joined Temple’s IEEE student branch last year after one of his professors offered extra credit to students who did so. After attending meetings and participating in a few workshops, he found he really liked the club, he says, adding that he made new friends and enjoyed the camaraderie with other engineering students.
After the student branch’s board members got to know McGinley better, they asked him to become the club’s historian and manage its social media account. He also helps with event planning, creating and posting fliers, taking pictures, and shooting videos of the gatherings.
The branch has benefited from McGinley’s involvement, but he says it’s a two-way street.
“The biggest things I’ve learned are being held accountable and being reliable,” he says. “I am responsible for other people knowing what’s going on.”
Being an active volunteer has improved his communication skills, he says.
“Learning to clearly communicate with other people to make sure everyone is on the same page is important,” he says. “In school, they don’t teach you how to communicate with people. They only teach you how to remember stuff. Working well with people is one of the most underrated skills that a lot of students don’t understand is important.”
He encourages students to join their university’s IEEE branch.
“I know it can be scary because you might not know anyone, but it honestly can’t hurt you; it could actually benefit you,” he says. “Being active is going to help you with a lot of skills that you need.
“You’ll definitely get opportunities that you would have never known about, like a scholarship or working in the research lab. I would have never gotten these opportunities if I hadn’t shown up. Joining IEEE and being active is the best thing you can do for your career.”
This article was updated on 9 April 2026.
2026-04-07 22:00:01

Artificial intelligence harbors an enormous energy appetite. Such constant cravings are evident in the hefty carbon footprint of the data centers behind the AI boom and the steady increase over time of carbon emissions from training frontier AI models.
No wonder big tech companies are warming up to nuclear energy, envisioning a future fueled by reliable, carbon-free sources. But while nuclear-powered data centers might still be years away, some in the research and industry spheres are taking action right now to curb AI’s growing energy demands. They’re tackling training as one of the most energy-intensive phases in a model’s life cycle, focusing their efforts on decentralization.
Decentralization allocates model training across a network of independent nodes rather than relying on one platform or provider. It allows compute to go where the energy is—be it a dormant server sitting in a research lab or a computer in a solar-powered home. Instead of constructing more data centers that require electric grids to scale up their infrastructure and capacity, decentralization harnesses energy from existing sources, avoiding adding more power into the mix.
Training AI models is typically a data-center sport, synchronized across clusters of closely connected GPUs. But as hardware improvements struggle to keep up with the swiftly rising size of large language models, even massive single data centers are no longer cutting it.
Tech firms are turning to the pooled power of multiple data centers—no matter their location. Nvidia, for instance, launched the Spectrum-XGS Ethernet for scale-across networking, which “can deliver the performance needed for large-scale single job AI training and inference across geographically separated data centers.” Similarly, Cisco introduced its 8223 router designed to “connect geographically dispersed AI clusters.”
Other companies are harvesting idle compute in servers, sparking the emergence of a GPU-as-a-Service business model. Take Akash Network, a peer-to-peer cloud computing marketplace that bills itself as the “Airbnb for data centers.” Those with unused or underused GPUs in offices and smaller data centers register as providers, while those in need of computing power are tenants who can choose among providers and rent their GPUs.
“If you look at [AI] training today, it’s very dependent on the latest and greatest GPUs,” says Akash cofounder and CEO Greg Osuri. “The world is transitioning, fortunately, from only relying on large, high-density GPUs to now considering smaller GPUs.”
In addition to orchestrating the hardware, decentralized AI training also requires algorithmic changes on the software side. This is where federated learning, a form of distributed machine learning, comes in.
It starts with an initial version of a global AI model housed in a trusted entity such as a central server. The server distributes the model to participating organizations, which train it locally on their data and share only the model weights with the trusted entity, explains Lalana Kagal, a principal research scientist at MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) who leads the Decentralized Information Group. The trusted entity then aggregates the weights, often by averaging them, integrates them into the global model, and sends the updated model back to the participants. This collaborative training cycle repeats until the model is considered fully trained.
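The cycle Kagal describes can be sketched in a few lines of Python. This is a minimal toy illustration of federated averaging, with invented data and a trivial local objective, not any group's actual code:

```python
# Toy federated-averaging sketch (hypothetical data and objective): a trusted
# server averages locally trained weights; participants never share raw data.

def local_train(weights, local_data, lr=0.1):
    """One participant's local update: a single toy gradient step that
    pulls the weights toward that participant's own (private) data."""
    grad = [w - d for w, d in zip(weights, local_data)]
    return [w - lr * g for w, g in zip(weights, grad)]

def federated_round(global_weights, participants):
    """One cycle: distribute the model, train locally, aggregate by averaging."""
    local_models = [local_train(list(global_weights), data) for data in participants]
    # The trusted entity sees only model weights, never the local data.
    return [sum(ws) / len(local_models) for ws in zip(*local_models)]

participants = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]  # each node's private data
weights = [0.0, 0.0]
for _ in range(50):
    weights = federated_round(weights, participants)
# weights now approach the average across participants, about [3.0, 4.0]
```

The key property is visible in `federated_round`: only weights cross the network, which is what makes the scheme attractive for both privacy and bandwidth.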
But there are drawbacks to distributing both data and computation. The constant back-and-forth exchange of model weights, for instance, results in high communication costs. Fault tolerance is another issue.
“A big thing about AI is that every training step is not fault-tolerant,” Osuri says. “That means if one node goes down, you have to restore the whole batch again.”
To overcome these hurdles, researchers at Google DeepMind developed DiLoCo, a distributed low-communication optimization algorithm. DiLoCo forms what Google DeepMind research scientist Arthur Douillard calls “islands of compute,” where each island consists of a group of chips. Different islands can use different chip types, but chips within an island must all be of the same type. Islands are decoupled from each other, and knowledge is synchronized between them only once in a while. This decoupling means islands can perform training steps independently without communicating as often, and chips can fail without interrupting the remaining healthy chips. However, the team’s experiments found diminishing performance beyond eight islands.
An improved version dubbed Streaming DiLoCo further reduces the bandwidth requirement by synchronizing knowledge “in a streaming fashion across several steps and without stopping for communicating,” says Douillard. The mechanism is akin to watching a video even if it hasn’t been fully downloaded yet. “In Streaming DiLoCo, as you do computational work, the knowledge is being synchronized gradually in the background,” he adds.
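In code, the DiLoCo idea boils down to running many cheap inner steps per island and paying for communication only at rare outer synchronizations. The sketch below is a heavily simplified toy with invented numbers: it plainly averages parameters at sync time, whereas the published algorithm applies an outer optimizer to the averaged updates.

```python
# Toy DiLoCo-style sketch (invented numbers): islands run many inner steps
# independently and communicate only at rare outer synchronizations.

def inner_step(w, shard, lr=0.05):
    """One local optimization step on an island's own data shard (toy objective)."""
    return w - lr * (w - shard)

def diloco(shards, total_steps=200, sync_every=50):
    params = [0.0 for _ in shards]              # one model replica per island
    for step in range(1, total_steps + 1):
        params = [inner_step(w, s) for w, s in zip(params, shards)]
        if step % sync_every == 0:              # communication happens here only
            avg = sum(params) / len(params)
            params = [avg] * len(params)
    return params[0]

# Four islands, each with a different shard; only 4 synchronizations in total.
result = diloco(shards=[1.0, 2.0, 3.0, 4.0])
```

Despite synchronizing only four times instead of after every step, the replicas still converge toward the shared solution, which is the tradeoff DiLoCo exploits.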
AI development platform Prime Intellect implemented a variant of the DiLoCo algorithm as a vital component of its 10-billion-parameter INTELLECT-1 model trained across five countries spanning three continents. Upping the ante, 0G Labs, makers of a decentralized AI operating system, adapted DiLoCo to train a 107-billion-parameter foundation model under a network of segregated clusters with limited bandwidth. Meanwhile, popular open-source deep learning framework PyTorch included DiLoCo in its repository of fault tolerance techniques.
“A lot of engineering has been done by the community to take our DiLoCo paper and integrate it in a system learning over consumer-grade internet,” Douillard says. “I’m very excited to see my research being useful.”
With hardware and software enhancements in place, decentralized AI training is primed to help solve AI’s energy problem. This approach offers the option of training models “in a cheaper, more resource-efficient, more energy-efficient way,” says MIT CSAIL’s Kagal.
And while training methods like DiLoCo are “arguably more complex,” Douillard admits, “they provide an interesting tradeoff of system efficiency.” For instance, you can now use data centers across far-apart locations without needing to build ultrafast bandwidth in between. Douillard adds that fault tolerance is baked in because “the blast radius of a chip failing is limited to its island of compute.”
Even better, companies can take advantage of existing underutilized processing capacity rather than continuously building new energy-hungry data centers. Betting big on such an opportunity, Akash created its Starcluster program. One of the program’s aims involves tapping into solar-powered homes and employing the desktops and laptops within them to train AI models. “We want to convert your home into a fully functional data center,” Osuri says.
Osuri acknowledges that participating in Starcluster will not be trivial. Beyond solar panels and devices equipped with consumer-grade GPUs, participants would also need to invest in batteries for backup power and redundant internet to prevent downtime. The Starcluster program is figuring out ways to package all these aspects together and make it easier for homeowners, including collaborating with industry partners to subsidize battery costs.
Backend work is already underway to enable homes to participate as providers in the Akash Network, and the team hopes to reach its target by 2027. The Starcluster program also envisions expanding into other solar-powered locations, such as schools and local community sites.
Decentralized AI training holds much promise to steer AI toward a more environmentally sustainable future. For Osuri, such potential lies in moving AI “to where the energy is instead of moving the energy to where AI is.”
2026-04-07 21:00:01

Picture a highway with networked autonomous cars driving along it. On a serene, cloudless day, these cars need only exchange thimblefuls of data with one another. Now picture the same stretch in a sudden snow squall: The cars rapidly need to share vast amounts of essential new data about slippery roads, emergency braking, and changing conditions.
These two very different scenarios involve vehicle networks with very different computational loads. Eavesdropping on network traffic using a ham radio, you wouldn’t hear much static on the line on a clear, calm day. On the other hand, sudden whiteout conditions on a wintry day would sound like a cacophony of sensor readings and network chatter.
Normally this cacophony would mean two simultaneous problems: congested communications and a rising demand for computing power to handle all the data. But what if the network itself could expand its processing capabilities with every rising decibel of chatter and with every sensor’s chirp?
Traditional wireless networks treat communication as separate from computation: first you move data, then you process it. An emerging paradigm called over-the-air computation (OAC) could fundamentally change the game. First proposed in 2005 and recently developed and prototyped by a number of teams around the world, including ours, OAC combines communication and computation into a single framework. This means that an OAC sensor network—whether shared among autonomous vehicles, Internet-of-Things sensors, smart-home devices, or smart-city infrastructure—can carry some of the network’s computing burden as conditions demand.
The idea takes advantage of a basic physical fact of electromagnetic radiation: When multiple devices transmit simultaneously, their wireless signals naturally combine in the air. Normally, such cross talk is seen as interference, which radios are designed to suppress—especially digital radios with their error-correcting schemes and inherent resistance to low-level noise.
But if we carefully design the transmissions, cross talk can enable a wireless network to directly perform some calculations, such as a sum or an average. Some prototypes today do this with analog-style signaling on otherwise digital radios—so that the superimposed waveforms represent numbers that have been added before digital signal processing takes place.
Researchers are also beginning to explore digital, over-the-air computation schemes, which embed the same ideas into digital formats, ultimately allowing the prototype schemes to coexist with today’s digital radio protocols. These various over-the-air computation techniques can help networks scale gracefully, enabling new classes of real-time, data-intensive services while making more efficient use of wireless spectrum.
OAC, in other words, turns signal interference from a problem into a feature, one that can help wireless systems support massive growth.
For decades, engineers designed radio communications protocols with one overriding goal: to isolate each signal and recover each message cleanly. Today’s networks face a different set of pressures. They must coordinate large groups of devices on shared tasks—such as AI model training or combining disparate sensor readings, also known as sensor fusion—while exchanging as little raw data as possible, to improve both efficiency and privacy. For these reasons, a new approach to transmitting and receiving data may be worth considering, one that doesn’t rely on collecting and storing every individual device’s contributions.
By turning interference into computation, OAC transforms the wireless medium from a contested battlefield into a collaborative workspace. This paradigm shift has far-reaching consequences: Signals no longer compete for isolation; they cooperate to achieve shared outcomes. OAC cuts through layers of digital processing, reduces latency, and lowers energy consumption.
Even very simple operations, such as addition, can be the building blocks of surprisingly powerful computations. Many complex processes can be broken down into combinations of simpler pieces, much like how a rich sound can be re-created by combining a few basic tones. By carefully shaping what devices transmit and how the result is interpreted at the receiver, the wireless channel running OAC can carry out other calculations beyond addition. In practice, this means that with the right design, wireless signals can compute a number of key functions that modern algorithms rely on.
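One standard trick from the over-the-air-computation literature illustrates how a channel that can only add is enough for other functions: devices pre-process their values, the channel sums, and the receiver post-processes. The Python sketch below, with invented readings, computes a geometric mean this way:

```python
import math

# Sketch: computing a geometric mean when the channel can only ADD.
# Devices pre-process (log), superposition sums, the receiver post-processes.

def device_encode(x):
    return math.log(x)             # pre-processing at each transmitter

def channel(transmissions):
    return sum(transmissions)      # the only "computation" done in the air

def receiver_decode(total, n):
    return math.exp(total / n)     # post-processing yields the geometric mean

values = [2.0, 8.0]                # readings at two hypothetical devices
geo_mean = receiver_decode(channel(device_encode(v) for v in values), len(values))
# geo_mean is the geometric mean of 2 and 8, i.e. 4.0
```

The same pre-/post-processing pattern extends addition over the air into products, norms, and other so-called nomographic functions.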
For instance, many key tasks in modern networks don’t require the logging and storage of every individual network transmission. Rather, the goal is instead to infer properties about aggregate patterns of network traffic—reaching agreement or identifying what matters most about the traffic. Consensus algorithms rely on majority voting to ensure reliable decisions, even when some devices fail. Artificial intelligence systems depend on matrix reduction and simplification operations such as “max pooling” (keeping only peak values) to extract the most useful signals from noisy data.
In smart cities and smart grids, what matters most is often not individual readings but distribution. How many devices report each traffic condition? What is the range of demand across neighborhoods? These are histogram questions—summaries of the device counts per category.
With type-based multiple access (TBMA), an over-the-air computation method we use, devices reporting a given condition transmit together over a shared channel. Their signals add up, and the receiver sees only the total signal strength per category. In a single transmission, the entire histogram emerges without ever identifying individual devices. And the more devices there are, the better the estimate. The result is greater spectrum efficiency, with lower latency and scalable, privacy-friendly operations—all from letting the wireless medium do the aggregating and counting.
It’s easy to imagine how analog values transmitted over the air could be summed via superposition. The amplitudes from different signals add together, so the values those amplitudes represent also simply add together. The more challenging question concerns preserving that additive magic, but with digital signals.
Here’s how OAC does it. Consider, for instance, one TBMA approach for a network of sensors that gives each possible sensor reading its own dedicated frequency channel. Every sensor on the network that reads “4” transmits on frequency four; every sensor that reads “7” transmits on frequency seven. When multiple devices share the same reading, their amplitudes combine. The stronger the combined signal at a given frequency, the more devices there are reporting that particular value.
A receiver equipped with a bank of filters tuned to each frequency reads out a count of votes for every possible sensor value. In a single, simultaneous transmission, the whole network has reported its state.
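That frequency-bin scheme is easy to simulate. The sketch below is a noiseless, idealized toy; real receivers must contend with noise, fading, and imperfect power control:

```python
# Noiseless TBMA sketch (idealized): each device transmits unit amplitude on
# the frequency bin matching its reading; superposition yields a histogram.

NUM_BINS = 10   # possible sensor readings 0..9, one frequency bin each

def transmit(readings):
    """All devices transmit simultaneously; amplitudes on a shared bin add."""
    channel = [0.0] * NUM_BINS
    for r in readings:
        channel[r] += 1.0           # superposition in the air
    return channel

def receive(channel):
    """A filter bank reads the signal strength in each bin -> device counts."""
    return [round(a) for a in channel]

readings = [4, 7, 4, 2, 7, 7, 4]    # individual devices, never identified
histogram = receive(transmit(readings))
# histogram[4] == 3 and histogram[7] == 3: three devices reported each value
```

Note that the receiver recovers only the histogram, never which device reported which value, which is the privacy property described above.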
It might seem paradoxical—digital computation riding atop what appears to be an analog physical effect. But this is also true of all “digital” radio. A Wi-Fi transmitter does not launch ones and zeroes into the air; it modulates electromagnetic waves whose amplitudes and phases encode digital data. The “digital” label ultimately refers to the information layer, not the physics. What makes OAC digital, in the same sense, is that the values being computed—each sensor reading, each frequency-bin count—are discrete and quantized from the start. And because they are discrete, the same error-correction machinery that has made digital communications robust for decades can be applied here too.
Synchronization is where OAC’s demands diverge most sharply from digital wireless conventions. Many OAC variants today require something akin to a shared clock at nanosecond precision: Every signal’s phase must be synchronized, or the superposition runs the risk of collapsing into destructive interference. While TBMA relaxes this burden a bit—devices need only share a time window—real engineering challenges lie ahead regardless, before over-the-air computation is ready for the mobile world.
Over-the-air computation has in recent years moved from theory to initial proofs-of-concept and network test runs. Our research teams in South Carolina and Spain have built working prototypes that deliver repeatable results—with no cables and no external timing sources such as GPS-locked references. All synchronization is handled within the radios themselves.
Our team at the University of South Carolina (led by Sahin) started with off-the-shelf software-defined radios—Analog Devices’ Adalm-Pluto. We modified the devices’ field-programmable gate array hardware inside each radio so it can respond to a trigger signal transmitted from another radio. This simple hack enabled simultaneous transmission, a core requirement for OAC. Our setup used five radios acting as edge devices and one acting as a base station. The task involved training a neural network to perform image recognition over the air. Our system, whose results we first reported in 2022, achieved a 95 percent accuracy in image recognition without ever moving raw data across the network.
We also demonstrated our initial OAC setup at a March 2025 IEEE 802.11 working group meeting, where an IEEE committee was studying AI and machine learning capabilities for future Wi-Fi standards. As we showed, OAC’s road ahead doesn’t necessarily require reinventing wireless technology. Rather, it can also build on and repurpose existing protocols already in Wi-Fi and 5G.
However, before OAC can become a routine feature of commercial wireless systems, networks must provide finer-tuned coordination of timing and signal power levels. Mobility is a difficult problem, too. When mobile devices move around, phase synchronization degrades quickly, and computational accuracy can suffer. Present-day OAC tests work in controlled lab environments. But making them robust in dynamic, real-world settings—vehicles on highways, sensors scattered across cities—remains a new frontier for this emerging technology.
Both of our teams are now scaling up our prototypes and demonstrations. We are together aiming to understand how over-the-air computation performs as the number of devices increases beyond lab-bench scales. Turning prototypes and test-beds into production systems for autonomous vehicles and smart cities will require anticipating tomorrow’s mobility and synchronization problems—and no doubt a range of other challenges down the road.
To realize the technological ambitions of over-the-air computation, nanosecond timing and exquisite RF signal design will be crucial. Fortunately, recent engineering advances have made substantial progress in both of these fields.
Because OAC demands waveform superposition, it benefits from tight coordination in time, frequency, phase, and amplitude among RF transmitters. Such requirements build naturally on decades of work in wireless communication systems designed for shared access. Modern networks already synchronize large numbers of devices using high-precision timing and uplink coordination.
OAC uses the same synchronization techniques already in cellular and Wi-Fi systems. But to actually run over-the-air computations, more precision still will be needed. Power control, gain adjustment, and timing calibration are standard tools today. We expect that engineers will further refine these existing methods to begin to meet OAC’s more stringent accuracy demands.

In some cases, in fact, imperfect timing standards may be all that’s needed. Designs and emerging standards in 5G and 6G wireless systems today use clever encoding that tolerates imperfect synchronization. Minor timing errors, frequency drift, and signal overlap can in some cases still work capably within an OAC protocol, we anticipate. Instead of fighting messiness, over-the-air computation may sometimes simply be able to roll with it.
Another challenge ahead concerns shifting processing to the transmitter. Instead of the receiver trying to clean up overlapping signals, a better and more efficient approach would involve each transmitter fixing its own signal before sending. Such “pre-compensation” techniques are already used in MIMO technology (multi-antenna systems in modern Wi-Fi and cellular networks). OAC would just be repurposing techniques that have already been developed for 5G and 6G technologies.
Materials science can also help OAC efforts ahead. New generations of reconfigurable intelligent surfaces shape signals via tiny adjustable elements in the antenna. The surfaces catch radio signals and reshape them as they bounce around. Reconfigurable surfaces can strengthen useful signals, eliminate interference, and synchronize wavefront arrivals that would otherwise be out of sync. OAC stands to benefit from these and other emerging capabilities that intelligent surfaces will provide.
At the system level, OAC will represent a fundamental shift in wireless network system design. Wireless engineers have traditionally tried to avoid designing devices that transmit at the same time. But over-the-air systems will flip the old, familiar design standards on their head.
One might object that OAC stands to upend decades of existing wireless signal standards that have always presumed data pipes to be data pipes only—not microcomputers as well. Yet we do not anticipate much difficulty merging OAC with existing wireless standards. In a sense, in fact, the IEEE 802.11 and 3GPP (3rd Generation Partnership Project) standards bodies have already shown the way.
A network can set aside certain brief time windows or narrow slices of bandwidth for over‑the‑air computation, and use the rest for ordinary data. From the radio’s point of view, OAC just becomes another operating mode that is turned on when needed and left off the rest of the time.
Over the past decade, both the IEEE and 3GPP have integrated once-experimental technologies into their wireless standards—for example, millimeter-wave mobile communications, multiuser MIMO, beamforming, and network slicing—by defining each new technological advance as an optional feature. OAC, we suggest, can also operate alongside conventional wireless data traffic as an optional service. Because OAC places high demands on timing and accuracy, networks will need the ability to enable or disable over‑the‑air computation on a per‑application basis.
With continued progress, OAC will evolve from lab prototype to standardized wireless capability through the 2020s and into the decade ahead. In the process, the wireless medium will transform from a passive data carrier into an active computational partner—providing essential infrastructure for the real-time intelligent systems that future wireless technologies will demand.
So on that snowy highway sometime in the 2030s, vehicles and sensors won’t wait for permission to think together. Using the emerging over-the-air computation protocols that we’re helping to pioneer, simultaneous computation will be the new default. The networks will work as one.
2026-04-07 21:00:01

In late-stage testing of a distributed AI platform, engineers sometimes encounter a perplexing situation: every monitoring dashboard reads “healthy,” yet users report that the system’s decisions are slowly becoming wrong.
Engineers are trained to recognize failure in familiar ways: a service crashes, a sensor stops responding, a constraint violation triggers a shutdown. Something breaks, and the system tells you. But a growing class of software failures looks very different. The system keeps running, logs appear normal, and monitoring dashboards stay green. Yet the system’s behavior quietly drifts away from what it was designed to do.
This pattern is becoming more common as autonomy spreads across software systems. Quiet failure is emerging as one of the defining engineering challenges of autonomous systems because correctness now depends on coordination, timing, and feedback across entire systems.
Consider a hypothetical enterprise AI assistant designed to summarize regulatory updates for financial analysts. The system retrieves documents from internal repositories, synthesizes them using a language model, and distributes summaries across internal channels.
Technically, everything works. The system retrieves valid documents, generates coherent summaries, and delivers them without issue.
But over time, something slips. Maybe an updated document repository isn’t added to the retrieval pipeline. The assistant keeps producing summaries that are coherent and internally consistent, but they’re increasingly based on obsolete information. Nothing crashes, no alerts fire, every component behaves as designed. The problem is that the overall result is wrong.
From the outside, the system looks operational. From the perspective of the organization relying on it, the system is quietly failing.
One reason quiet failures are difficult to detect is that traditional systems measure the wrong signals. Operational dashboards track uptime, latency, and error rates, the core elements of modern observability. These metrics are well-suited for transactional applications where requests are processed independently, and correctness can often be verified immediately.
Autonomous systems behave differently. Many AI-driven systems operate through continuous reasoning loops, where each decision influences subsequent actions. Correctness emerges not from a single computation but from sequences of interactions across components and over time. A retrieval system may return technically valid but contextually inappropriate information. A planning agent may generate steps that are locally reasonable but globally unsafe. A distributed decision system may execute correct actions in the wrong order.
None of these conditions necessarily produces errors. From the perspective of conventional observability, the system appears healthy. From the perspective of its intended purpose, it may already be failing.
The deeper issue is architectural. Traditional software systems were built around discrete operations: a request arrives, the system processes it, and the result is returned. Control is episodic and externally initiated by a user, scheduler, or external trigger.
Autonomous systems change that structure. Instead of responding to individual requests, they observe, reason, and act continuously. AI agents maintain context across interactions. Infrastructure systems adjust resources in real time. Automated workflows trigger additional actions without human input.
In these systems, correctness depends less on whether any single component works, and more on coordination across time.
Distributed-systems engineers have long wrestled with issues of coordination. But this is coordination of a new kind. It’s no longer about things like keeping data consistent across services. It’s about ensuring that a stream of decisions—made by models, reasoning engines, planning algorithms, and tools, all operating with partial context—adds up to the right outcome.
A modern AI system may evaluate thousands of signals, generate candidate actions, and execute them across a distributed infrastructure. Each action changes the environment in which the next decision is made. Under these conditions, small mistakes can compound. A step that is locally reasonable can still push the system further off course.
Engineers are beginning to confront what might be called behavioral reliability: whether an autonomous system’s actions remain aligned with its intended purpose over time.
When organizations encounter quiet failures, the initial instinct is to improve monitoring: deeper logs, better tracing, more analytics. Observability is essential, but it only shows that the behavior has already diverged—it doesn’t correct it.
Quiet failures require something different: the ability to shape system behavior while it is still unfolding. In other words, autonomous systems increasingly need control architectures, not just monitoring.
Engineers in industrial domains have long relied on supervisory control systems. These are software layers that continuously evaluate a system’s status and intervene when behavior drifts outside safe bounds. Aircraft flight-control systems, power-grid operations, and large manufacturing plants all rely on such supervisory loops. Software systems historically avoided them because most applications didn’t need them. Autonomous systems increasingly do.
Behavioral monitoring in AI systems focuses on whether actions remain aligned with intended purpose, not just whether components are functioning. Instead of relying only on metrics such as latency or error rates, engineers look for signs of behavior drift: shifts in outputs, inconsistent handling of similar inputs, or changes in how multi-step tasks are carried out. An AI assistant that begins citing outdated sources, or an automated system that takes corrective actions more often than expected, may signal that the system is no longer using the right information to make decisions. In practice, this means tracking outcomes and patterns of behavior over time.
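As a sketch of what such tracking might look like (the statistic, window sizes, and threshold here are illustrative assumptions, not a standard method), one can compare a recent window of an output statistic, say, the fraction of citations pointing to current sources, against a longer-running baseline collected while the system was judged healthy:

```python
from collections import deque
from statistics import mean, stdev

class DriftMonitor:
    """Flag behavioral drift by comparing a recent window of an
    output statistic against a longer-running baseline of samples
    observed while the system was judged healthy."""

    def __init__(self, baseline_size=200, window_size=20, threshold=3.0):
        self.baseline = deque(maxlen=baseline_size)   # "healthy" history
        self.window = deque(maxlen=window_size)       # recent behavior
        self.threshold = threshold                    # in baseline std devs

    def observe(self, value):
        self.window.append(value)
        drifting = self.is_drifting()
        if not drifting:
            # Only samples that look healthy update the baseline,
            # so drift cannot quietly become the new "normal."
            self.baseline.append(value)
        return drifting

    def is_drifting(self):
        if len(self.baseline) < 30 or len(self.window) < self.window.maxlen:
            return False  # not enough history to judge
        mu, sigma = mean(self.baseline), stdev(self.baseline)
        return abs(mean(self.window) - mu) > self.threshold * max(sigma, 1e-9)
```

Fed a steady stream of healthy readings the monitor stays quiet; once the recent window shifts away from the baseline, it flags drift even though no individual request ever failed.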
Supervisory control builds on these signals by intervening while the system is running. A supervisory layer checks whether ongoing actions remain within acceptable bounds and can respond by delaying or blocking actions, limiting the system to safer operating modes, or routing decisions for review. In more advanced setups, it can adjust behavior in real time—for example, by restricting data access, tightening constraints on outputs, or requiring extra confirmation for high-impact actions.
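A minimal sketch of such a supervisory layer, with a hypothetical numeric "impact" score per action and thresholds chosen purely for illustration:

```python
class Supervisor:
    """Supervisory control layer: vet each proposed action before it
    executes, and escalate or block when it falls outside bounds."""

    def __init__(self, max_impact=100.0):
        self.max_impact = max_impact
        self.review_queue = []  # actions routed for human review

    def vet(self, action):
        impact = action.get("impact", 0.0)
        if impact <= self.max_impact:
            return "allow"                 # within normal bounds
        if impact <= 10 * self.max_impact:
            self.review_queue.append(action)
            return "review"                # borderline: require confirmation
        return "block"                     # clearly out of bounds: refuse

sup = Supervisor(max_impact=100.0)
decisions = [sup.vet(a) for a in (
    {"name": "refresh_cache", "impact": 5.0},
    {"name": "bulk_update", "impact": 500.0},
    {"name": "wipe_storage", "impact": 5000.0},
)]
# decisions: one action allowed, one routed for review, one blocked
```

The key property is that the check runs inline, before the action executes, rather than after the fact in a dashboard.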
Together, these approaches turn reliability into an active process. Systems don’t just run, they are continuously checked and steered. Quiet failures may still occur, but they can be detected earlier and corrected while the system is operating.
Preventing quiet failures requires a shift in how engineers think about reliability: from ensuring components work correctly to ensuring system behavior stays aligned over time. Rather than assuming that correct behavior will emerge automatically from component design, engineers must increasingly treat behavior as something that needs active supervision.
As AI systems become more autonomous, this shift will likely spread across many domains of computing, including cloud infrastructure, robotics, and large-scale decision systems. The hardest engineering challenge may no longer be building systems that work, but ensuring that they continue to do the right thing over time.
2026-04-06 22:22:58

While browsing our website a few weeks ago, I stumbled upon “How and When the Memory Chip Shortage Will End” by Senior Editor Samuel K. Moore. His analysis focuses on the current DRAM shortage caused by AI hyperscalers’ ravenous appetite for memory, a major constraint on the speed at which large language models run. Moore provides a clear explanation of the shortage, particularly for high bandwidth memory (HBM).
As we and the rest of the tech media have documented, AI is a resource hog. AI electricity consumption could account for up to 12 percent of all U.S. power by 2028. Generative AI queries consumed 15 terawatt-hours in 2025 and are projected to consume 347 TWh by 2030. Water consumption for cooling AI data centers is predicted to double or even quadruple by 2028 compared to 2023.
But Moore’s reporting shines a light on an obscure corner of the AI boom. HBM is a particular type of memory product tailor-made to serve AI processors. Makers of those processors, notably Nvidia and AMD, are demanding more and more memory for each of their chips, driven by the needs and wants of firms like Google, Microsoft, OpenAI, and Anthropic, which are underwriting an unprecedented buildout of data centers. And some of these facilities are colossal: You can read about the engineering challenges of building Meta’s mind-boggling 5-gigawatt Hyperion site in Louisiana, in “What Will It Take to Build the World’s Largest Data Center?”
We realized that Moore's HBM story was both important and unique, and so we decided to include it in this issue, with some updates since the original was published on 10 February. We paired it with a recent story by Contributing Editor Matthew S. Smith exploring how the memory-chip shortage is driving up the price of low-cost computers like the Raspberry Pi. The result is "AI Is a Memory Hog."
The big question now is, When will the shortage end? Price pressure on all kinds of consumer electronics, driven by AI hyperscaler demand, is being masked by stubborn inflation and a perpetually shifting tariff regime, at least here in the United States. So I asked Moore what indicators he's looking for that would signal an easing of the memory shortage.
“On the supply side, I’d say that if any of the big three HBM companies—Micron, Samsung, and SK Hynix—say that they are adjusting the schedule of the arrival of new production, that’d be an important signal,” Moore told me. “On the demand side, it will be interesting to see how tech companies adapt up and down the supply chain. Data centers might steer toward hardware that sacrifices some performance for less memory. Startups developing all sorts of products might pivot toward creative redesigns that use less memory. Constraints like shortages can lead to interesting technology solutions, so I’m looking forward to covering those.”
To be sure you don’t miss any of Moore’s analysis of this topic and to stay current on the entire spectrum of technology development, sign up for our weekly newsletter, Tech Alert.