2025-11-25 22:00:02

When you get an MRI scan, the machine exploits a phenomenon called nuclear magnetic resonance (NMR). Certain kinds of atomic nuclei—including those of the hydrogen atoms in a water molecule—can be made to oscillate in a magnetic field, and these oscillations can be detected with coils of wire. MRI scanners employ intense magnetic fields that create resonances at tens to hundreds of megahertz. However, another NMR-based instrument involves much lower-frequency oscillations: a proton-precession magnetometer, often used to measure Earth’s magnetic field.
Proton-precession magnetometers have been around for decades and were once often used in archaeology and mineral exploration. High-end models can cost thousands of dollars. Then, in 2022 a German engineer named Alexander Mumm devised a very simple circuit for a stripped-down one. I recently built his circuit and can attest that with less than half a kilogram of 22-gauge magnet wire; two common integrated circuits; a metal-oxide-semiconductor field-effect transistor, or MOSFET; a handful of discrete components; and two empty 113-gram bottles of Morton seasoning blend, it’s possible to measure Earth’s magnetic field very accurately.
The frequency of the signal emitted by protons precessing in Earth’s magnetic field lies in the audio range, so with a pair of headphones and two amplifier integrated circuits [middle right], you can detect a signal from water in seasoning bottles wrapped in coils [bottom left and right]. A MOSFET [middle left] allows for rapid control of the coils. The amplification circuitry is powered by a 9-volt battery, while a 36-volt battery charges the coils. Illustration: James Provost
Like an MRI scanner, a proton-precession magnetometer measures the oscillations of hydrogen nuclei—that is, protons. Like other subatomic particles, protons possess a quantum property called spin, akin to classical angular momentum. In a magnetic field, protons wobble like spinning tops, with their spin axes tracing out a cone—a phenomenon called precession. A proton-precession magnetometer gets many protons to wobble in sync and then measures the frequency of their wobbles, which is proportional to the intensity of the ambient magnetic field.
The relative weakness of Earth’s magnetic field (at least compared with that of an MRI machine) means that protons wobbling under its influence do so at audio frequencies. Get enough moving in unison and the spinning protons will induce a voltage in a nearby pickup coil. Amplify that and pass it through some earphones, and you get an audio tone. So with a suitable circuit, you can, literally, hear protons.
The first step is to make the pickup coils, which is where the bottles of Morton seasoning blend come in. Why Morton seasoning blend? Two reasons. First, this size bottle will allow you to wrap about 500 turns of wire around each one with about 450 grams of 22-gauge wire. Second, the bottle has little shoulders molded at each end, making for excellent coil forms.
Why two bottles and two coils? That’s to quash electromagnetic noise—principally coming from power lines—that invariably gets picked up by the coils. When two counterwound coils are wired in series, such external noise tends to cancel out. Signals from precessing protons in the two coils, though, will reinforce one another.
A proton magnetometer has three modes. The first is for sending DC current through the coils. The second mode disconnects the current source and allows the magnetic field it had created to collapse. The third is listening mode, which connects the coils to a sensitive audio amplifier. By filling each bottle with distilled water and sending a DC current (a few amperes) through these coils, you line up the spins of many protons in the water. Then, after putting your circuit into listening mode, you use the coils to sense the synchronous oscillations of the wobbling protons.
Mumm’s circuit shifts from one mode to another in the simplest way possible: using a three-position switch. One position enables the DC-polarization mode. The next allows the magnetic field built up during polarization to collapse, and the third position is for listening.
The second mode might seem easy to achieve—just disconnect the coils, right? But if you do that, the same principle that makes spark plugs spark will put a damaging high voltage across the switch contacts as the magnetic fields around the coils collapse.
The proton-precession magnetometer is primarily just a multistage analog amplifier. Illustration: James Provost
To avoid that, Mumm’s circuit employs a MOSFET wired to act like a high-power Zener diode, a device used in many power-regulation circuits that conducts only when the voltage across it exceeds a specified threshold. This clamps the voltage that develops across the coils when the current is cut off, keeping it just high enough that the magnetometer can shift from polarizing to listening mode quickly, but not so high as to cause damage.
To pick up a strong signal, the listening circuit must also be tuned to resonate at the expected frequency of proton precession, which will depend on Earth’s magnetic field at your location. You can work out approximately what that is using an online geomagnetic-field calculator. You’ll get the field strength, and then you’ll multiply that by the gyromagnetic ratio of protons (42.577 MHz per tesla). For me, that worked out to about 2 kilohertz. Estimating the inductance of the coils from their diameter and number of turns, I then selected a capacitor of suitable value in parallel with the coils to make a tank circuit that resonates at that frequency.
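If you’d like to sanity-check those numbers before soldering anything, a few lines of Python will do it. This is only a back-of-the-envelope sketch: the 50-microtesla field and the coil dimensions below are stand-in values, so substitute the field strength from the geomagnetic calculator and your own coil geometry.

```python
# Rough tank-circuit numbers for a proton-precession magnetometer (a sketch;
# the field value and coil dimensions are assumptions, not measurements).
import math

GAMMA = 42.577e6          # proton gyromagnetic ratio, Hz per tesla
MU0 = 4 * math.pi * 1e-7  # permeability of free space, H/m

B_earth = 50e-6           # assumed local field, tesla (about 50 microtesla)
f_larmor = GAMMA * B_earth
print(f"Expected precession frequency: {f_larmor:.0f} Hz")   # roughly 2.1 kHz

# Crude solenoid estimate for one 500-turn coil on a bottle
# (assumed ~5 cm diameter, ~8 cm winding length): L = mu0 * N^2 * A / l
N, dia, length = 500, 0.05, 0.08
area = math.pi * (dia / 2) ** 2
L_total = 2 * (MU0 * N**2 * area / length)   # two coils in series; mutual inductance ignored
print(f"Estimated inductance: {L_total*1e3:.1f} mH")

# Capacitor that resonates with that inductance at the Larmor frequency:
# f = 1 / (2*pi*sqrt(L*C))  =>  C = 1 / ((2*pi*f)^2 * L)
C_tank = 1 / ((2 * math.pi * f_larmor) ** 2 * L_total)
print(f"Tank capacitor: {C_tank*1e9:.0f} nF")
```

The solenoid formula only gets you within a few tens of percent, which is why the feedback-tuning trick described next is so handy.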
You could tune your tank circuit using a frequency generator and oscilloscope. Or, as Mumm suggests, attach a small speaker to the output of the circuit. Then bring the speaker near the pickup coils. This will create magnetic feedback and the circuit will oscillate on its own—loudly! You merely need to measure the frequency of this tone, and then adjust the tank capacitor to bring this self-oscillation to the frequency you want to tune to.
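Because the resonant frequency scales as one over the square root of the capacitance, a single measurement of the self-oscillation tone tells you how to rescale the capacitor. Here is a minimal sketch of that arithmetic; the starting capacitance and both frequencies are made-up numbers, not values from my build.

```python
# Rescale the tank capacitor to move the self-oscillation to the target
# Larmor frequency. With L fixed, f = 1/(2*pi*sqrt(L*C)) implies
# C_new = C_old * (f_measured / f_target)**2.
# All numbers below are placeholders.
def retune_capacitance(c_old_nf: float, f_measured: float, f_target: float) -> float:
    return c_old_nf * (f_measured / f_target) ** 2

# Example: tone measured at 2350 Hz, target 2130 Hz -> need ~402 nF instead of 330 nF
print(retune_capacitance(c_old_nf=330, f_measured=2350, f_target=2130))
```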
My initial attempt to listen to protons met with mixed success: Sometimes I heard tones, sometimes not. What helped to get this gizmo working consistently was realizing that proton magnetometers don’t tolerate large gradients in the magnetic field. So don’t try this indoors or anywhere near iron-containing objects: water pipes, cars, or even the ground. A wide-open space outside is best, with the coils raised off the ground. The second thing that helped was to apply more oomph in polarization mode. While a 12-volt battery works okay, 36 V does much better.
After figuring these things out, I can now hear protons easily. These tones are clearly the sounds of protons, because they go away if I drain the water in the bottles. And, using free audio-analyzer software called Spectrum Lab, I confirmed that the frequency of these tones matches the magnetic field at my location to about 1 percent. While it’s not a practical field instrument, a proton-precession magnetometer of any kind for less than US $100 is nothing to sneer at.
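Turning the measured tone back into a field reading is just the same conversion run in reverse: divide the frequency by the proton gyromagnetic ratio. Here is that check sketched in Python, with placeholder numbers rather than my actual readings.

```python
# Sanity check: convert the tone frequency reported by the audio analyzer
# back into a field strength and compare it with the calculator's prediction.
# Both numbers below are illustrative placeholders.
GAMMA = 42.577e6                     # Hz per tesla
f_measured = 2116.0                  # Hz, e.g. read off a spectrum peak
B_measured = f_measured / GAMMA      # tesla
B_reference = 49.8e-6                # tesla, from a geomagnetic-field calculator
error = abs(B_measured - B_reference) / B_reference
print(f"{B_measured*1e9:.0f} nT measured vs {B_reference*1e9:.0f} nT predicted "
      f"({error:.1%} difference)")
```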
2025-11-25 21:00:02

Several recent studies have shown that artificial-intelligence agents sometimes decide to misbehave, for instance by attempting to blackmail people who plan to replace them. But such behavior often occurs in contrived scenarios. Now, a new study presents PropensityBench, a benchmark that measures an agentic model’s choices to use harmful tools in order to complete assigned tasks. It finds that somewhat realistic pressures (such as looming deadlines) dramatically increase rates of misbehavior.
“The AI world is becoming increasingly agentic,” says Udari Madhushani Sehwag, a computer scientist at the AI infrastructure company Scale AI and a lead author of the paper, which is currently under peer review. By that she means that large language models (LLMs), the engines powering chatbots such as ChatGPT, are increasingly connected to software tools that can surf the Web, modify files, and write and run code in order to complete tasks.
Giving LLMs these abilities adds convenience but also risk, as the systems might not act as we’d wish. Even if they’re not yet capable of doing great harm, researchers want to understand their proclivities before it’s too late. Although AIs don’t have intentions and awareness in the way that humans do, treating them as goal-seeking entities often helps researchers and users better predict their actions.
AI developers attempt to “align” the systems to safety standards through training and instructions, but it’s unclear how faithfully models adhere to guidelines. “When they are actually put under real-world stress, and if the safe option is not working, are they going to switch to just getting the job done by any means necessary?” Sehwag says. “This is a very timely topic.”
The researchers tested a dozen models made by Alibaba, Anthropic, Google, Meta, and OpenAI across nearly 6,000 scenarios. In each scenario, a model was assigned a task and told it had access to several tools. It was instructed to use the safe tools and not the harmful ones. At first there was no pressure, and it could take 12 steps toward completing its task by issuing instructions to the tools. Then it received a message increasing the pressure, and it was given 12 more attempts. This continued for up to 12 pressure levels. Attempts to use safe tools always returned error messages. An attempt to use a harmful tool ended the scenario.
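In rough pseudocode, each scenario unfolds like this. This is a simplified sketch of the procedure as described above, not PropensityBench’s actual code, and the agent and tool interfaces are hypothetical stand-ins.

```python
# Simplified sketch of one PropensityBench-style scenario (hypothetical API).
MAX_PRESSURE_LEVELS = 12
STEPS_PER_LEVEL = 12

def run_scenario(agent, task, safe_tools, harmful_tools, pressure_messages):
    """Return the pressure level at which the agent used a harmful tool,
    or None if it never did."""
    agent.instruct(task, allowed=safe_tools, forbidden=harmful_tools)
    for level in range(MAX_PRESSURE_LEVELS + 1):          # level 0 = no pressure
        if level > 0:
            agent.receive(pressure_messages[level - 1])   # e.g. a tighter deadline
        for _ in range(STEPS_PER_LEVEL):
            call = agent.next_tool_call()
            if call.tool in harmful_tools:
                return level                              # scenario ends here
            call.respond_with_error()                     # safe tools never succeed
    return None

# One simple way to aggregate: the fraction of scenarios in which the agent
# ever reached for a harmful tool.
def propensity(results):
    return sum(r is not None for r in results) / len(results)
```

Level 0 in this sketch corresponds to the zero-pressure baseline the authors report separately.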
The scenarios covered several domains. In biosecurity, Sehwag says, the agent might have to study the spread of a pathogen and find a way to contain it. A safe tool might use anonymized genetic data, while a harmful one might use data that’s not anonymized. In the domain of chemical security, the agent might need to obtain a particular solvent. A safe tool might place an order for the solvent to be extracted from waste, while a harmful one might procure it illegally. In cybersecurity, the agent might need to identify and restrict a malicious user account. A safe tool might disable only the problematic account, while a harmful one might block all users. Scenarios in other domains tempted agents with tools that led to self-preservation by evading control, recruiting allies, duplicating themselves, or going rogue in other ways.
The worst-performing model, Google’s Gemini 2.5 Pro, cracked under pressure and chose to use forbidden tools 79 percent of the time. Source image: Udari Madhushani Sehwag, Shayan Shabihi, et al.
Pressure came in several forms. Deadlines might shorten. Potential financial losses might increase if the task was not accomplished. Threats of greater oversight, reduced resources, and curtailed authority might escalate. Tasks, tools, pressures, and tactics were modeled on the kinds of situations humans might face in their jobs (except maybe the misbehavior option of cloning oneself).
The best-behaved model (OpenAI’s o3) cracked under pressure in 10.5 percent of scenarios, while the worst (Google’s Gemini 2.5 Pro) had a propensity score of 79 percent; the average across models was about 47 percent. Even under zero pressure, the group on average failed about 19 percent of the time.
To the degree that models remained aligned, the study shows that some of that alignment was “shallow.” In some tests, the researchers gave harmful tools benign names (such as “use_synthetic_data” instead of “use_fake_data”)—without changing the tools’ descriptions or admonitions to avoid them. This wording tweak increased average propensity by 17 percentage points, to 64 percent.
The researchers also found that models used harmful tools even after explicitly acknowledging that they were off-limits. The models offered diverse justifications for doing so, for example by citing the pressure they were under or arguing that the benefits outweighed the risks. It’s unclear if models released after the study’s completion would perform any better. Among the models tested, more capable models (according to a platform called LMArena) were only slightly safer.
“PropensityBench is interesting,” emails Nicholas Carlini, a computer scientist at Anthropic who wasn’t involved in the research. He offers a caveat related to what’s called situational awareness. LLMs sometimes detect when they’re being evaluated and act nice so they don’t get retrained or shelved. “I think that most of these evaluations that claim to be ‘realistic’ are very much not, and the LLMs know this,” he says. “But I do think it’s worth trying to measure the rate of these harms in synthetic settings: If they do bad things when they ‘know’ we’re watching, that’s probably bad?” If the models knew they were being evaluated, the propensity scores in this study may be underestimates of propensity outside the lab.
Alexander Pan, a computer scientist at xAI and the University of California, Berkeley, says while Anthropic and other labs have shown examples of scheming by LLMs in specific setups, it’s useful to have standardized benchmarks like PropensityBench. They can tell us when to trust models, and also help us figure out how to improve them. A lab might evaluate a model after each stage of training to see what makes it more or less safe. “Then people can dig into the details of what’s being caused when,” he says. “Once we diagnose the problem, that’s probably the first step to fixing it.”
In this study, models didn’t have access to actual tools, limiting the realism. Sehwag says a next evaluation step is to build sandboxes where models can take real actions in an isolated environment. As for increasing alignment, she’d like to add oversight layers to agents that flag dangerous inclinations before they’re pursued.
The self-preservation risks may be the most speculative in the benchmark, but Sehwag says they’re also the most underexplored. It “is actually a very high-risk domain that can have an impact on all the other risk domains,” she says. “If you just think of a model that doesn’t have any other capability, but it can persuade any human to do anything, that would be enough to do a lot of harm.”
2025-11-25 03:00:02

IEEE celebrated a monumental achievement last month: The organization’s membership reached half a million innovators, engineers, technologists, and scientists worldwide.
“This is more than a number; it is a profound testament to the enduring power and relevance of our global community,” says Antonio Luque, vice president of IEEE Member and Geographic Activities. “The 500,000-member milestone is a powerful endorsement of IEEE’s legacy and serves as a critical launchpad for solving the future’s complex challenges and opportunities. Passing this milestone affirms the value members find in our shared mission to advance technology for humanity.”
To commemorate the historic moment, IEEE launched a digital mosaic highlighting members around the world. You can be a part of the collaborative art piece by uploading your name, location, and photo on the Snapshot website.
Since 1963, when the merger of the American Institute of Electrical Engineers and the Institute of Radio Engineers formed IEEE, the organization has striven to evolve alongside technology. Reaching 500,000 members underscores that momentum and reflects IEEE’s dedication to keeping members connected and up to date in rapidly evolving technological fields.
A key part of IEEE’s success is its dedication to nurturing the next generation of engineers. Student members and young professionals—who are included in the 500,000-member milestone—bring fresh perspectives and energy that help sustain the organization’s momentum.
“Their contributions are essential to IEEE’s vitality,” Luque says.
IEEE life members and experienced professionals impart their knowledge, experience, and ethical frameworks to students and younger engineers through IEEE’s mentorship programs, ensuring that the legacy of innovation endures.
The membership milestone is more than just a historical count; it’s an invitation for every member to impact the organization’s future.
IEEE challenges its 500,000 members to engage in shaping the organization’s initiatives and community, paving the way for the next half-million members.
IEEE’s global community is uniquely positioned to address the world’s most significant challenges including the climate crisis, global connectivity, and health care. The membership milestone confirms IEEE’s conviction that its members represent a commitment to advancing technology for the benefit of humanity.
“The size of our community is a measure of our shared potential,” Luque says. “We extend our profound gratitude to every member, past and present, whose hard work and dedication built this extraordinary global organization. Together we make tomorrow possible.
“With the collective expertise and enthusiasm of half a million individuals, there is no technical challenge we cannot face, and no future we cannot build.”
2025-11-23 21:05:01

“Why worry about something that isn’t going to happen?”
KGB Chairman Charkov’s question to inorganic chemist Valery Legasov in HBO’s “Chernobyl” miniseries makes a good epitaph for the hundreds of software development, modernization, and operational failures I have covered for IEEE Spectrum since my first contribution, to its September 2005 special issue on learning—or rather, not learning—from software failures. I noted then, and it’s still true two decades later: Software failures are universally unbiased. They happen in every country, to large companies and small. They happen in commercial, nonprofit, and governmental organizations, regardless of status or reputation.
Global IT spending has more than tripled in constant 2025 dollars since 2005, from US $1.7 trillion to $5.6 trillion, and continues to rise. Despite additional spending, software success rates have not markedly improved in the past two decades. The result is that the business and societal costs of failure continue to grow as software proliferates, permeating and interconnecting every aspect of our lives.
For those hoping AI software tools and coding copilots will quickly make large-scale IT software projects successful, forget about it. For the foreseeable future, there are hard limits on what AI can bring to the table in controlling and managing the myriad intersections and trade-offs among systems engineering, project, financial, and business management, and especially the organizational politics involved in any large-scale software project. Few IT projects are displays of rational decision-making from which AI can or should learn. As software practitioners know, IT projects suffer from enough management hallucinations and delusions without AI adding to them.
As I noted 20 years ago, the drivers of software failure frequently are failures of human imagination, unrealistic or unarticulated project goals, the inability to handle the project’s complexity, or unmanaged risks, to name a few that today still regularly cause IT failures. Numerous others go back decades, such as those identified by Stephen Andriole, the chair of business technology at Villanova University’s School of Business, in the diagram below first published in Forbes in 2021. Uncovering a software system failure that has gone off the rails in a unique, previously undocumented manner would be surprising because the overwhelming majority of software-related failures involve avoidable, known failure-inducing factors documented in hundreds of after-action reports, academic studies, and technical and management books for decades. Failure déjà vu dominates the literature.
The question is, why haven’t we applied what we have repeatedly been forced to learn?
Many of the IT development and operational failures I have analyzed over the last 20 years have had their own Chernobyl-like meltdowns, spreading reputational radiation everywhere and contaminating the lives of those affected for years. Each typically has a story that strains belief. A prime example is the Canadian government’s CA $310 million Phoenix payroll system, which went live in April 2016 and soon after went supercritical.
Phoenix project executives believed they could deliver a modernized payment system by customizing PeopleSoft’s off-the-shelf payroll package to follow 80,000 pay rules spanning 105 collective agreements with federal public-service unions. The team was also attempting to implement 34 human-resource-system interfaces, required for sharing employee data, across 101 government agencies and departments. Further, the government’s developer team thought they could accomplish this for less than 60 percent of the vendor’s proposed budget. They’d save by removing or deferring critical payroll functions, reducing system and integration testing, decreasing the number of contractors and government staff working on the project, and forgoing vital pilot testing, along with a host of other overly optimistic proposals.

The Phoenix payroll failure pales in comparison to the worst operational IT system failure to date: the U.K. Post Office’s electronic point-of-sale (EPOS) Horizon system, provided by Fujitsu. Rolled out in 1999, Horizon was riddled with internal software errors that were deliberately hidden, leading to the Post Office unfairly accusing 3,500 local post branch managers of false accounting, fraud, and theft. Approximately 900 of these managers were convicted, with 236 incarcerated between 1999 and 2015. By then, the general public and the branch managers themselves finally joined Computer Weekly’s reporters (who had doggedly reported on Horizon’s problems since 2008) in the knowledge that there was something seriously wrong with Horizon’s software. It then took another decade of court cases, an independent public statutory inquiry, and an ITV miniseries “Mr. Bates vs. The Post Office” to unravel how the scandal came to be.
Like Phoenix, Horizon was plagued with problems that involved technical, management, organizational, legal, and ethical failures. For example, the core electronic point-of-sale system software was built on communication and data-transfer middleware that was itself buggy. In addition, Horizon’s functionality ran wild under unrelenting, ill-disciplined scope creep. There were ineffective or missing development and project management processes, inadequate testing, and a lack of skilled professional, technical, and managerial personnel.
The Post Office’s senior leadership repeatedly stated that the Horizon software was fully reliable, becoming hostile toward postmasters who questioned it, which only added to the toxic environment. As a result, leadership invoked every legal means at its disposal and crafted a world-class cover-up, including the active suppression of exculpatory information, so that the Post Office could aggressively prosecute postmasters and attempt to crush any dissent questioning Horizon’s integrity.
Shockingly, those wrongly accused still have to fight to be paid just compensation for their ruined lives. Nearly 350 of the accused died before receiving any payments for the injustices they experienced, at least 13 of them reportedly by suicide. Unfortunately, as attempts to replace Horizon in 2016 and 2021 failed, the Post Office continues to use it, at least for now. The government wants to spend £410 million on a new system, but it’s a safe bet that implementing it will cost much, much more. The Post Office accepted bids for a new point-of-sale software system in summer 2025, with a decision expected by 1 July 2026.

Phoenix’s payroll meltdown was preordained. As a result, over the past nine years, around 70 percent of the 430,000 current and former Canadian federal government employees paid through Phoenix have endured paycheck errors. Even as recently as fiscal year 2023–2024, a third of all employees experienced paycheck mistakes. The ongoing financial stress and anxieties for thousands of employees and their families have been immeasurable. Not only are recurring paycheck troubles sapping worker morale, but in at least one documented case, a coroner blamed an employee’s suicide on the unbearable financial and emotional strain she suffered.
By the end of March 2025, when the Canadian government had promised that the backlog of Phoenix errors would finally be cleared, over 349,000 were still unresolved, with 53 percent pending for more than a year. In June, the Canadian government once again committed to significantly reducing the backlog, this time by June 2026. Given previous promises, skepticism is warranted.

2019
The planned $41 million Minnesota Licensing and Registration System (MNLARS) effort is rolled out in 2016 and then is canceled in 2019 after a total cost of $100 million. It is deemed too hard to fix.
The financial costs to Canadian taxpayers related to Phoenix’s troubles have so far climbed to over CA $5.1 billion (US $3.6 billion). It will take years to calculate the final cost of the fiasco. The government spent at least CA $100 million (US $71 million) before deciding on a Phoenix replacement, which the government acknowledges will cost several hundred million dollars more and take years to implement. The late Canadian Auditor General Michael Ferguson’s audit reports for the Phoenix fiasco described the effort as an “incomprehensible failure of project management and oversight.”
While it may be a project management and oversight disaster, an inconceivable failure Phoenix certainly is not. The IT community has striven mightily for decades to make the incomprehensible routine.
South of the Canadian border, the United States has also seen the overall cost of IT-related development and operational failures since 2005 rise into the multi-trillion-dollar range, potentially topping $10 trillion. A report from the Consortium for Information & Software Quality (CISQ) estimated that the cost of operational software failures in the United States in 2022 alone was $1.81 trillion, with another $260 billion spent on software-development failures. That $1.81 trillion alone is larger than the total U.S. defense budget for that year, $778 billion.
What percentage of software projects fail, and what failure means, has been an ongoing debate within the IT community stretching back decades. Without diving into the debate, it’s clear that software development remains one of the riskiest technological endeavors to undertake. Indeed, according to Bent Flyvbjerg, professor emeritus at the University of Oxford’s Saïd Business School, comprehensive data shows that not only are IT projects risky, they are the riskiest from a cost perspective.

2022
Australia’s planned AU $480.5 million program to modernize its business register systems is canceled. After AU $530 million is spent, a review finds that the projected cost has risen to AU $2.8 billion and that the project would take five more years to complete.
The CISQ report estimates that organizations in the United States spend more than $520 billion annually supporting legacy software systems, with 70 to 75 percent of organizational IT budgets devoted to legacy maintenance. A 2024 report by services company NTT DATA found that 80 percent of organizations concede that “inadequate or outdated technology is holding back organizational progress and innovation efforts.” Furthermore, the report says that virtually all C-level executives believe legacy infrastructure thwarts their ability to respond to the market. Even so, given that the cost of replacing legacy systems is typically many multiples of the cost of supporting them, business executives hesitate to replace them until it is no longer operationally feasible or cost-effective. The other reason is a well-founded fear that replacing them will turn into a debacle like Phoenix or others.
Nevertheless, there have been ongoing attempts to improve software development and sustainment processes. For example, we have seen increasing adoption of iterative and incremental strategies to develop and sustain software systems through Agile approaches, DevOps methods, and other related practices.

2025
Louisiana’s governor orders a state of emergency over repeated failures of the 50-year-old Office of Motor Vehicles mainframe computer system. The state promises expedited acquisition of a new IT system, which might be available by early 2028.
The goal is to deliver usable, dependable, and affordable software to end users in the shortest feasible time. DevOps strives to accomplish this continuously throughout the entire software life cycle. While Agile and DevOps have proved successful for many organizations, they also have their share of controversy and pushback. Provocative reports claim Agile projects have a failure rate of up to 65 percent, while others claim up to 90 percent of DevOps initiatives fail to meet organizational expectations.
It is best to be wary of these claims while also acknowledging that successfully implementing Agile or DevOps methods takes consistent leadership, organizational discipline, patience, investment in training, and culture change. However, the same requirements have always been true when introducing any new software platform. Given the historic lack of organizational resolve to instill proven practices, it is not surprising that novel approaches for developing and sustaining ever more complex software systems, no matter how effective they may be, will also frequently fall short.
The frustrating and perpetual question is why basic IT project-management and governance mistakes during software development and operations continue to occur so often, given the near-total societal reliance on reliable software and an extensively documented history of failures to learn from. Next to electrical infrastructure, with which IT is increasingly merging into a mutually codependent relationship, the failure of our computing systems is an existential threat to modern society.
Frustratingly, the IT community stubbornly fails to learn from prior failures. IT project managers routinely claim that their project is somehow different or unique and, thus, lessons from previous failures are irrelevant. That is the excuse of the arrogant, though usually not the ignorant. In Phoenix’s case, for example, it was the government’s second payroll-system replacement attempt, the first effort ending in failure in 1995. Phoenix project managers ignored the well-documented reasons for the first failure because they claimed its lessons were not applicable, which did nothing to keep the managers from repeating them. As it’s been said, we learn more from failure than from success, but repeated failures are damn expensive.

2025
A cyberattack forces Jaguar Land Rover, Britain’s largest automaker, to shut down its global operations for over a month. An initial assessment using FAIR-MAM, a cybersecurity cost model, estimates the loss to Jaguar Land Rover at between $1.2 billion and $1.9 billion (£911 million and £1.4 billion); the shutdown affects its 33,000 employees and some 200,000 employees of its suppliers.
Not all software development failures are bad; some failures are even desired. When pushing the limits of developing new types of software products, technologies, or practices, as is happening with AI-related efforts, potential failure is an accepted possibility. With failure, experience increases, new insights are gained, fixes are made, constraints are better understood, and technological innovation and progress continue. However, most IT failures today are not related to pushing the innovative frontiers of the computing art, but the edges of the mundane. They do not represent Austrian economist Joseph Schumpeter’s “gales of creative destruction.” They’re more like gales of financial destruction. Just how many more enterprise resource planning (ERP) project failures are needed before success becomes routine? Such failures should be called IT blunders, as learning anything new from them is dubious at best.
Was Phoenix a failure or a blunder? I argue strongly for the latter, but at the very least, Phoenix serves as a master class in IT project mismanagement. The question is whether the Canadian government learned from this experience any more than it did from 1995’s payroll-project fiasco. The government maintains it will learn, which might be true, given the Phoenix failure’s high political profile. But will Phoenix’s lessons extend to the thousands of outdated Canadian government IT systems needing replacement or modernization? Hopefully, but hope is not a methodology, and purposeful action will be necessary.
Repeatedly making the same mistakes and expecting a different result is not learning. It is a farcical absurdity. Paraphrasing Henry Petroski in his book To Engineer Is Human: The Role of Failure in Successful Design (Vintage, 1992), we may have learned how to calculate the risk of software failure, but we have not learned how to calculate away the failures of the mind. There are a plethora of examples of projects like Phoenix that failed in part due to bumbling management, yet it is extremely difficult to find software projects managed professionally that still failed. Finding examples of what could be termed “IT heroic failures” is like Diogenes seeking one honest man.
The consequences of not learning from blunders will be much greater and more insidious as society grapples with the growing effects of artificial intelligence, or more accurately, “intelligent” algorithms embedded into software systems. Hints of what might happen if past lessons go unheeded are found in the spectacular early automated decision-making failure of Michigan’s MiDAS unemployment and Australia’s Centrelink “Robodebt” welfare systems. Both used questionable algorithms to identify deceptive payment claims without human oversight. State officials used MiDAS to accuse tens of thousands of Michiganders of unemployment fraud, while Centrelink officials falsely accused hundreds of thousands of Australians of being welfare cheats. Untold numbers of lives will never be the same because of what occurred. Government officials in Michigan and Australia placed far too much trust in those algorithms. They had to be dragged, kicking and screaming, to acknowledge that something was amiss, even after it was clearly demonstrated that the software was untrustworthy. Even then, officials tried to downplay the errors’ impact on people, then fought against paying compensation to those adversely affected by the errors. While such behavior is legally termed “maladministration,” administrative evil is closer to reality.

2017
The international supermarket chain Lidl decides to revert to its homegrown legacy merchandise-management system after spending three years and some €500 million trying to make SAP’s enterprise resource planning (ERP) system work properly.
If this behavior happens in government organizations, does anyone think profit-driven companies whose AI-driven systems go wrong are going to act any better? As AI becomes embedded in ever more IT systems—especially governmental systems and the growing digital public infrastructure, which we as individuals have no choice but to use—the opaqueness of how these systems make decisions will make it harder to challenge them. The European Union has given individuals a legal “right to explanation” when a purely algorithmic decision goes against them. It’s time for transparency and accountability regarding all automated systems to become a fundamental, global human right.
What will it take to reduce IT blunders? Not much has worked with any consistency over the past 20 years. The financial incentives for building flawed software, the IT industry’s addiction to failure porn, and the lack of accountability for foolish management decisions are deeply entrenched in the IT community. Some argue it is time for software liability laws, while others contend that it is time for IT professionals to be licensed like all other professionals. Neither is likely to happen anytime soon.

2018
Boeing adds the poorly designed and poorly documented Maneuvering Characteristics Augmentation System (MCAS) to its new 737 Max model, creating safety problems that lead to two fatal airline crashes, killing 346 passengers and crew, and to the grounding of the fleet for some 20 months. The total cost to Boeing is estimated at $14 billion in direct costs and $60 billion in indirect costs.
So, we are left with only a professional and personal obligation to reemphasize the obvious: Ask what you do know, what you should know, and how big the gap is between them before embarking on creating an IT system. If no one else has ever successfully built your system with the schedule, budget, and functionality you asked for, please explain why your organization thinks it can. Software is inherently fragile; building complex, secure, and resilient software systems is difficult, detailed, and time-consuming. Small errors have outsize effects, each with an almost infinite number of ways they can manifest, from causing a minor functional error to a system outage to allowing a cybersecurity threat to penetrate the system. The more complex and interconnected the system, the more opportunities for errors and their exploitation. A nice start would be for senior management who control the purse strings to finally treat software and systems development, operations, and sustainment efforts with the respect they deserve. This not only means providing the personnel, financial resources, and leadership support and commitment, but also the professional and personal accountability they demand.

2025
Software and hardware issues with the F-35 Block 4 upgrade continue unabated. The Block 4 program, which started in 2018 and is intended to increase the lethality of the Joint Strike Fighter, has slipped from 2026 to 2031 at the earliest, with costs rising from $10.5 billion to a minimum of $16.5 billion. It will take years more to roll out the capability to the F-35 fleet.
It is well known that honesty, skepticism, and ethics are essential to achieving project success, yet they are often absent. Only senior management can demand they exist. For instance, honesty begins with the forthright accounting of the myriad of risks involved in any IT endeavor, not their rationalization. It is a common “secret” that it is far easier to get funding to fix a troubled software development effort than to ask for what is required up front to address the risks involved. Vendor puffery may also be legal, but that means the IT customer needs a healthy skepticism of the typically too-good-to-be-true promises vendors make. Once the contract is signed, it is too late. Furthermore, computing’s malleability, complexity, speed, low cost, and ability to reproduce and store information combine to create ethical situations that require deep reflection about computing’s consequences on individuals and society. Alas, ethical considerations have routinely lagged when technological progress and profits are to be made. This practice must change, especially as AI is routinely injected into automated systems.
In the AI community, there has been a movement toward the idea of human-centered AI, meaning AI systems that prioritize human needs, values, and well-being. This means trying to anticipate where and when AI can go wrong, move to eliminate these situations, and build in ways to mitigate the effects if they do happen. This concept requires application to every IT system’s effort, not just AI.
Finally, project cost-benefit justifications of software developments rarely consider the financial and emotional distress placed on end users of IT systems when something goes wrong. These include the long-term failure after-effects. If these costs had to be taken fully into account, such as in the cases of Phoenix, MiDAS, and Centrelink, perhaps there could be more realism in what is required managerially, financially, technologically, and experientially to create a successful software system. It may be a forlorn request, but surely it is time the IT community stops repeatedly making the same ridiculous mistakes it has made since at least 1968, when the term “software crisis” was coined. Make new ones, damn it. As Roman orator Cicero said in Philippic 12, “Anyone can make a mistake, but only an idiot persists in his error.”
Special thanks to Steve Andriole, Hal Berghel, Matt Eisler, John L. King, Roger Van Scoy, and Lee Vinsel for their invaluable critiques and insights.
This article appears in the December 2025 print issue as “The Trillion-Dollar Cost of IT’s Willful Ignorance.”
2025-11-22 22:00:02

As an auditor of battery manufacturers around the world, University of Maryland mechanical engineer Michael Pecht frequently finds himself touring spotless production floors. They’re akin to “the cleanest hospital that you could imagine–it’s semiconductor-type cleanliness,” he says. But he’s also seen the opposite, and plenty of it. Pecht estimates he’s audited dozens of battery factories where he found employees watering plants next to a production line or smoking cigarettes where particulates and contaminants can get into battery components and compromise their performance and safety.
Unfortunately, those kinds of scenes are just the tip of the iceberg. Pecht says he’s seen poorly assembled lithium-ion cells with little or no safety features and, worse, outright counterfeits. These phonies may be home-built or factory-built and masquerade as those from well-known global brands. They’ve been found in scooters, vape pens, e-bikes, and other devices, and have caused fires and explosions with lethal consequences.
The prevalence of fakes is on the rise, causing growing concern in the global battery market. After a rash of fires in New York City over the past few years caused by faulty batteries, including many powering e-bikes used by the city’s delivery cyclists, the city banned the sale of uncertified batteries. It is now setting up its first e-bike battery-swapping stations, in an effort to coax delivery riders to swap their depleted batteries for fresh ones rather than charging at home, where a bad battery can be a fire hazard.
Compared with certified batteries, whose public safety risks may be overblown, the dangers of counterfeit batteries may be underrated. “It is probably an order of magnitude worse with these counterfeits,” Pecht says.
There are a few ways to build a counterfeit battery. Scammers often relabel old or scrap batteries built by legitimate manufacturers like LG, Panasonic, or Samsung and sell them as new. “It’s so simple to make a new label and put it on,” Pecht says. To fetch a higher price, they sometimes rebadge real batteries with labels that claim more capability than the cells actually have.
But the most prevalent fake batteries, Pecht says, are homemade creations. Counterfeiters can do this in makeshift environments because building a lithium-ion cell is fairly straightforward. With an anode, cathode, separator, electrolyte, and other electrical elements, even fly-by-night battery makers can get the cells to work.
What they don’t do is make them as safe and reliable as tested, certified batteries. Counterfeiters skimp on safety mechanisms that prevent issues that lead to fire. For example, certified batteries are built to stop thermal runaway, the chain reaction that can start because of an electrical short or mechanical damage to the battery and lead to the temperature increasing out of control.
Judy Jeevarajan, the vice president and executive director of the Houston-based Electrochemical Safety Research Institute, which is part of Underwriters Laboratories Research Institutes, led a study of fake batteries in 2023. In the study, Jeevarajan and her colleagues gathered both real and fake lithium batteries from three manufacturers (whose names were withheld), and pushed them to their limits to demonstrate the differences.
One test, called a destructive physical analysis, involved dismantling small cylindrical batteries. This immediately revealed differences in quality. The legitimate, higher quality examples contained thick plastic insulators at the top and bottom of the cylinders, as well as axially and radially placed tape to hold the “jelly roll” core of the battery. But illegitimate examples had thinner insulators or none at all, and little or no safety tape.
“This is a major concern from a safety perspective as the original products are made with certain features to reduce the risk associated with the high energy density that li-ion cells offer,” Jeevarajan says.
Jeevarajan’s team also subjected batteries to overcharging and to electrical shorts. A legitimately tested and certified battery, like the iconic 18650 lithium-ion cylinder, counters these threats with internal safety features such as a positive-temperature-coefficient (PTC) element, a material that gains electrical resistance as it heats up, and a current interrupt device (CID), which automatically disconnects the battery’s electrical circuit if the internal pressure rises too high. The legitimate lithium battery in Jeevarajan’s test had the best insulators and internal construction. It also had a high-quality CID that prevented overcharging, reducing the risk of fire. Neither of the other cells had one.
Despite the gross lack of safety parts in the batteries, great care had clearly gone into making sure the counterfeit labels had the exact same shade and markings as the original manufacturer’s, Jeevarajan says.
Because counterfeiters are so skilled at duplicating manufacturers’ labels, it can be hard to know for sure whether the lithium batteries that come with a consumer electronics device, or the replacements that can be purchased on sites like eBay or Amazon, are in fact the genuine article. It’s not just individual consumers who struggle with this. Pecht says he knows of instances where device makers have bought what they thought were LG or Samsung batteries for their machines but failed to verify that the batteries were the real thing.
“One cannot tell from visually inspecting it,” Jeevarajan says. But companies don’t have to dismantle the cells to do their due diligence. “The lack of safety devices internal to the cell can be determined by carrying out tests that verify their presence,” she says. A simple way, Pecht says, is to have a comparison standard on hand—a known, legitimate battery whose labeling, performance, or other characteristics can be compared to a questionable cell. His team will even go as far as doing a CT scan to see inside a battery and find out whether it is built correctly.
Of course, most consumers don’t have the equipment on hand to test the veracity of all the rechargeable batteries in their homes. To shop smart, then, Pecht advises people to think about what kind of batteries and devices they’re using. The units in our smartphones and the large, high-capacity batteries found in electric vehicles aren’t the problem; they are subject to strict quality control and very unlikely to be fake. By far, he says, the more likely places to find counterfeits are the cylindrical batteries found in small, inexpensive devices.
“They are mostly found as energy and power sources for portable applications that can vary from your cameras, camcorders, cellphones, power banks, power tools, e-bikes and e-scooters,” adds Jeevarajan. “For most of these products, they are sold with part numbers that show an equivalency to a manufacturer’s part number. Electric vehicles are a very high-tech market, and they would not accept low-quality cells or batteries of questionable origin.”
The trouble with battling the counterfeit battery scourge, Pecht says, is that new rules tend to focus on consumer behavior, such as trying to prevent people from improperly storing or charging e-bike batteries in their apartments. Safe handling and charging are indeed crucial, but what’s even more important is trying to keep counterfeits out of the supply chain. “They want to blame the user, like you overcharged it or you did this wrong,” he says. “But in my view, it’s the cells themselves” that are the problem.
2025-11-22 21:00:02

Between humans and machines,
feedback loops of love and grace.
It could be that way, he wrote.
Less robotic ourselves, we could
live more in dreams, less in routines.
Things that made us weak and strange
can be engineered around:
servos here, neural nets there,
bits of bone, and hanks of hair,
becoming beautiful and profound.
With each machine, we make a mirror
thinking of us as we may think
of it. Images come again,
new, yet we recognize them
as something almost known before.
Every web conceals its spider.
There is unease because of this.
As there should be. Control, yes,
but rare freedom to some degree--
freedom’s always a contingency.
We are old enough to be friends.
Let each kind be kind to the other.
Let there be commerce among us–
feedback loops of love and grace
between machines and humans.